Marek Polacek [Wed, 9 Sep 2020 17:49:26 +0000 (13:49 -0400)]
testsuite: Move auto-96647.C to cpp1y/.
This test uses a C++14 feature so fails with -std=c++11. Therefore
I've moved it to cpp1y/ and used target c++14.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/auto-96647.C: Moved to...
* g++.dg/cpp1y/auto-96647.C: ...here. Use target c++14.
H.J. Lu [Wed, 9 Sep 2020 17:29:47 +0000 (10:29 -0700)]
x32: Update gcc.target/i386/builtin_thread_pointer.c
Update gcc.target/i386/builtin_thread_pointer.c for x32. For
int
foo3 (int i)
{
int* p = (int*) __builtin_thread_pointer ();
return p[i];
}
we can't generate:
movl %fs:0(,%edi,4), %eax
ret
for x32 since the address of %fs:0(,%edi,4) is %fs plus 0(,%edi,4)
zero-extended to 64 bits. Instead, we generate:
movl %fs:0, %eax
movl (%eax,%edi,4), %eax
PR target/96955
* gcc.target/i386/builtin_thread_pointer.c: Update scan-assembler
for x32.
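The addressing difference can be modelled with plain integer arithmetic.
This is only a sketch, assuming that on x32 the 32-bit effective address is
zero-extended before the segment base is added. The function names and the
sample base value 0x10000 are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Models "movl %fs:0(,%edi,4), %eax": the 32-bit effective address
   0 + idx*4 wraps modulo 2^32, is zero-extended, and only then added
   to the %fs base.  */
static uint64_t
addr_single_insn (uint64_t fs_base, int32_t idx)
{
  uint32_t ea = (uint32_t) idx * 4u;
  return fs_base + (uint64_t) ea;
}

/* Models "movl %fs:0, %eax; movl (%eax,%edi,4), %eax": the thread
   pointer is loaded first, so the whole sum wraps modulo 2^32.  */
static uint64_t
addr_two_insns (uint32_t thread_ptr, int32_t idx)
{
  uint32_t ea = thread_ptr + (uint32_t) idx * 4u;
  return (uint64_t) ea;
}

/* Non-negative indexes agree; a negative index shows the difference.  */
static void
x32_model_demo (void)
{
  assert (addr_single_insn (0x10000, 2) == addr_two_insns (0x10000u, 2));
  assert (addr_single_insn (0x10000, -1) == 0x10000FFFCull);
  assert (addr_two_insns (0x10000u, -1) == 0xFFFCull);
}
```

With idx == -1 the single-instruction form lands almost 4 GiB above the
thread pointer, while the two-instruction form yields thread pointer - 4,
which is what p[i] means in the C source.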
H.J. Lu [Tue, 8 Sep 2020 12:54:56 +0000 (05:54 -0700)]
libphobos: Include <cet.h> to generate the CET marker for -fcf-protection
Include <cet.h> to generate the CET marker for -fcf-protection to avoid
/bin/ld: ../libdruntime/.libs/libgdruntime_convenience.a(libgdruntime_convenience_la-switchcontext.o): error: missing IBT and SHSTK properties
when -z cet-report=error is passed to the linker to create libgphobos.so
and libgdruntime.so.
PR d/95680
* libdruntime/config/x86/switchcontext.S: Include <cet.h> to
generate the CET marker for -fcf-protection.
Tom de Vries [Wed, 9 Sep 2020 16:43:13 +0000 (18:43 +0200)]
[nvptx, libgcc] Fix Wbuiltin-declaration-mismatch in atomic.c
When building for target nvptx, we get this and similar warnings for libgcc:
...
src/libgcc/config/nvptx/atomic.c:39:1: warning: conflicting types for \
built-in function ‘__sync_val_compare_and_swap_1’; expected \
‘unsigned char(volatile void *, unsigned char, unsigned char)’ \
[-Wbuiltin-declaration-mismatch]
...
Fix this by making sure in atomic.c that the pointers used are of type
'volatile void *'.
Tested by rebuilding atomic.c.
libgcc/ChangeLog:
* config/nvptx/atomic.c (__SYNC_SUBWORD_COMPARE_AND_SWAP): Fix
Wbuiltin-declaration-mismatch.
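For illustration, a sub-word compare-and-swap whose pointer parameter has
the 'volatile void *' type the warning expects might look like the sketch
below. It is not the libgcc source: the name subword_cas_1 is hypothetical,
the byte is emulated on top of a 4-byte __sync_val_compare_and_swap, and a
little-endian byte order is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 1-byte compare-and-swap built on a 4-byte one.
   Returns the byte value seen before the operation.  Little-endian
   only: the byte offset is turned directly into a shift count.  */
static unsigned char
subword_cas_1 (volatile void *vptr, unsigned char oldval,
               unsigned char newval)
{
  assert (vptr != 0);
  uintptr_t addr = (uintptr_t) vptr;
  volatile uint32_t *wordptr = (volatile uint32_t *) (addr & ~(uintptr_t) 3);
  unsigned shift = (unsigned) (addr & 3) * 8;
  uint32_t mask = (uint32_t) 0xff << shift;

  for (;;)
    {
      uint32_t cur = *wordptr;
      unsigned char curbyte = (unsigned char) ((cur & mask) >> shift);
      if (curbyte != oldval)
        return curbyte;           /* comparison failed, nothing written */
      uint32_t want = (cur & ~mask) | ((uint32_t) newval << shift);
      /* Retry if another thread changed the containing word meanwhile.  */
      if (__sync_val_compare_and_swap (wordptr, cur, want) == cur)
        return oldval;
    }
}
```

Because the parameter is 'volatile void *', callers can pass a pointer to
any object type without a cast, which matches how the built-in is declared.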
Segher Boessenkool [Fri, 7 Aug 2020 01:31:38 +0000 (01:31 +0000)]
bb-reorder: Remove a misfiring micro-optimization (PR96475)
When the compgotos pass copies the tail of blocks ending in an indirect
jump, there is a micro-optimization to not copy the last one, since the
original block will then just be deleted. This does not work properly
if cleanup_cfg does not merge all pairs of blocks we expect it to. It
also does not work if that last block can be merged into multiple
predecessors.
2020-09-09 Segher Boessenkool <segher@kernel.crashing.org>
PR rtl-optimization/96475
* bb-reorder.c (maybe_duplicate_computed_goto): Remove single_pred_p
micro-optimization.
Nick Clifton [Wed, 9 Sep 2020 14:54:20 +0000 (15:54 +0100)]
If the lto plugin encounters a file with multiple symbol sections, each
of which also has a v1 symbol extension section[1] then it will attempt
to read the extension data for *every* symbol from each of the extension
sections. This results in reading off the end of a buffer with the
associated memory corruption that that entails. This patch fixes that
problem.
2020-09-09 Nick Clifton <nickc@redhat.com>
* lto-plugin.c (struct plugin_symtab): Add last_sym field.
(parse_symtab_extension): Only read as many entries as are
available in the buffer. Store the data read into the symbol
table indexed from last_sym. Increment last_sym.
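The bookkeeping described in this entry can be sketched roughly as follows.
This is a simplified model, not the lto-plugin source: the toy_* types, the
two-bytes-per-entry layout and the clamp against the table size are
assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified layout: each extension entry is two bytes.  */
struct toy_sym { unsigned char type; unsigned char section_kind; };

struct toy_symtab
{
  struct toy_sym *syms;
  size_t nsyms;     /* total symbols across all symbol sections */
  size_t last_sym;  /* first symbol not yet covered by extension data */
};

/* Reads one extension section [data, end).  Only as many entries as the
   buffer holds are consumed, and they are stored starting at last_sym,
   so a second section continues where the first one stopped.  */
static void
toy_parse_extension (const unsigned char *data, const unsigned char *end,
                     struct toy_symtab *out)
{
  assert (out->last_sym <= out->nsyms);
  if (data >= end)
    return;
  unsigned char version = *data++;
  if (version != 1)
    return;

  size_t avail = (size_t) (end - data) / 2;
  /* Extra guard added for this sketch: never write past the table.  */
  if (avail > out->nsyms - out->last_sym)
    avail = out->nsyms - out->last_sym;

  for (size_t i = 0; i < avail; i++)
    {
      struct toy_sym *sym = &out->syms[out->last_sym + i];
      sym->type = data[2 * i];
      sym->section_kind = data[2 * i + 1];
    }
  out->last_sym += avail;
}
```

Calling toy_parse_extension once per extension section fills the table in
order, instead of each call trying to cover every symbol in the table.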
Tom de Vries [Wed, 9 Sep 2020 13:37:58 +0000 (15:37 +0200)]
[nvptx] Fix Wformat in nvptx_assemble_decl_begin
I'm running into this warning:
...
src/gcc/config/nvptx/nvptx.c: In function \
‘void nvptx_assemble_decl_begin(FILE*, const char*, const char*, \
const_tree, long int, unsigned int, bool)’:
src/gcc/config/nvptx/nvptx.c:2229:29: warning: format ‘%d’ expects argument \
of type ‘int’, but argument 5 has type ‘long unsigned int’ [-Wformat=]
elt_size * BITS_PER_UNIT);
^
...
which I seem to have introduced in commit b9c7fe59f9f "[nvptx] Fix array
dimension in nvptx_assemble_decl_begin", but not noticed due to configuring
with --disable-build-format-warnings.
Fix this by using the appropriate format.
Tested by rebuilding cc1 on nvptx.
gcc/ChangeLog:
* config/nvptx/nvptx.c (nvptx_assemble_decl_begin): Fix Wformat
warning.
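As a generic illustration of this class of fix (the commit does not say
which format it switched to), the conversion has to match the promoted type
of the argument. The helper name format_elt_bits below is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the bit size of an element into BUF.
   'unsigned long * unsigned' has type 'unsigned long', so the matching
   conversion is %lu; printing it with %d is what -Wformat reports.  */
static int
format_elt_bits (char *buf, size_t len, unsigned long elt_size,
                 unsigned bits_per_unit)
{
  assert (buf != NULL && len > 0);
  return snprintf (buf, len, "%lu", elt_size * bits_per_unit);
}
```

For elt_size == 8 and bits_per_unit == 8 the buffer receives "64".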
Patrick Palka [Wed, 9 Sep 2020 13:21:09 +0000 (09:21 -0400)]
c++: Fix resolving the address of overloaded pmf [PR96647]
In resolve_address_of_overloaded_function, currently only the second
pass over the overload set (which considers just the function templates
in the overload set) checks constraints and performs return type
deduction when necessary. But as the testcases below show, we need to
do the same when considering non-template functions during the first
pass.
gcc/cp/ChangeLog:
PR c++/96647
* class.c (resolve_address_of_overloaded_function): Check
constraints_satisfied_p and perform return-type deduction via
maybe_instantiate_decl when considering non-template functions
in the overload set.
* cp-tree.h (maybe_instantiate_decl): Declare.
* decl2.c (maybe_instantiate_decl): Remove static.
gcc/testsuite/ChangeLog:
PR c++/96647
* g++.dg/cpp0x/auto-96647.C: New test.
* g++.dg/cpp0x/error9.C: New test.
* g++.dg/cpp2a/concepts-fn6.C: New test.
Richard Biener [Wed, 9 Sep 2020 11:58:45 +0000 (13:58 +0200)]
fix useless unsharing of SLP tree
This avoids unsharing the SLP tree when optimizing load permutations
for reductions but there is no actual permute taking place.
2020-09-09 Richard Biener <rguenther@suse.de>
* tree-vect-slp.c (vect_attempt_slp_rearrange_stmts): Do
nothing when the permutation doesn't permute.
Tom de Vries [Wed, 9 Sep 2020 07:51:43 +0000 (09:51 +0200)]
[nvptx] Fix boolean type test in write_fn_proto
When running this libgomp testcase for nvptx accelerator:
...
/* { dg-do run } */
__uint128_t v;
int main () {
#pragma omp target
{
__uint128_t exp = 2;
__atomic_compare_exchange_n (&v, &exp, 7, false, __ATOMIC_RELEASE,
__ATOMIC_ACQUIRE);
}
}
...
we run into this assert in write_fn_proto:
...
913 gcc_assert (type == boolean_type_node);
...
This happens when doing some special-handling code for
__atomic_compare_exchange_1/2/4/8/16. The function decls have a parameter
called weak of type bool, which is skipped when writing the decl because
the corresponding libatomic functions do not have that parameter. The assert
is there to verify that we skip the correct parameter.
However, the assert fails because there are two different bool types:
...
(gdb) call debug_generic_expr (type)
_Bool
(gdb) call debug_generic_expr (global_trees[TI_BOOLEAN_TYPE])
bool
...
Fix this by checking for TREE_CODE (type) == BOOLEAN_TYPE instead.
Tested libgomp on x86_64-linux with nvptx accelerator.
Likewise, tested that the test-case above does not ICE anymore.
gcc/ChangeLog:
PR target/96991
* config/nvptx/nvptx.c (write_fn_proto): Fix boolean type check.
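The difference between the two checks can be shown with toy stand-ins for
tree nodes. Everything named toy_* below is hypothetical and only mimics the
shape of the problem: two distinct nodes that are both boolean types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for GCC's tree nodes; only the shape matters here.  */
enum toy_tree_code { TOY_INTEGER_TYPE, TOY_BOOLEAN_TYPE };
struct toy_type { enum toy_tree_code code; const char *name; };

/* Identity test: true only for one particular node.  */
static bool
is_same_node (const struct toy_type *type, const struct toy_type *expected)
{
  return type == expected;
}

/* Kind test: true for every node whose code is TOY_BOOLEAN_TYPE.  */
static bool
is_boolean_kind (const struct toy_type *type)
{
  assert (type != NULL);
  return type->code == TOY_BOOLEAN_TYPE;
}
```

Given one node named "_Bool" and another named "bool", is_same_node is false
for the pair while is_boolean_kind accepts both, which is the behaviour the
fix relies on.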
Richard Biener [Wed, 9 Sep 2020 10:05:55 +0000 (12:05 +0200)]
enable live comparison vectorization
This removes a check preventing vectorization of live results of
vectorized comparisons. I tested it with AVX512 mask registers
(inspecting assembly) and traditional vector masks.
2020-09-09 Richard Biener <rguenther@suse.de>
* tree-vect-stmts.c (vectorizable_comparison): Allow
STMT_VINFO_LIVE_P stmts.
* gcc.dg/vect/vect-live-6.c: New testcase.
Tobias Burnus [Wed, 9 Sep 2020 09:44:55 +0000 (11:44 +0200)]
gfortran.dg/gomp/combined-if.f90: Update nvptx tree-dump times
nvptx has additional omp simd lines with _simt_ with -O1 and higher.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/combined-if.f90: Update scan-tree-dump-times for
'omp simd.*if' for nvptx even more.
Richard Biener [Wed, 9 Sep 2020 08:36:46 +0000 (10:36 +0200)]
enable live condition vectorization
This removes a check preventing vectorization of live results of
vectorized conditions.
2020-09-09 Richard Biener <rguenther@suse.de>
* tree-vect-stmts.c (vectorizable_condition): Allow
STMT_VINFO_LIVE_P stmts.
* gcc.dg/vect/vect-cond-13.c: New testcase.
* gcc.target/i386/pr87007-4.c: Adjust.
* gcc.target/i386/pr87007-5.c: Likewise.
Rainer Orth [Wed, 9 Sep 2020 09:02:01 +0000 (11:02 +0200)]
config: Sync largefile.m4 from binutils-gdb
The following patch improves handling of largefile support with procfs
on 32-bit Solaris. It has already been approved and installed for
binutils-gdb in the thread starting at
[PATCH] Unify Solaris procfs and largefile handling
https://sourceware.org/pipermail/gdb-patches/2020-June/169977.html
I'm syncing the config/largefile.m4 part to gcc now which is the master
for config. Since ACX_LARGEFILE isn't used anywhere in the gcc tree,
I'm installing it as obvious.
2020-09-09 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
config:
* largefile.m4: Sync from binutils-gdb.
Richard Biener [Wed, 9 Sep 2020 07:45:29 +0000 (09:45 +0200)]
tree-optimization/96978 - fix fallout of BB vectorization of live stmts
This avoids looking at STMT_VINFO_LIVE_P when vectorizing BBs.
2020-09-09 Richard Biener <rguenther@suse.de>
PR tree-optimization/96978
* tree-vect-stmts.c (vectorizable_condition): Do not
look at STMT_VINFO_LIVE_P for BB vectorization.
(vectorizable_comparison): Likewise.
liuhongt [Tue, 8 Sep 2020 07:44:58 +0000 (15:44 +0800)]
Implement __builtin_thread_pointer for x86 TLS.
gcc/ChangeLog:
PR target/96955
* config/i386/i386.md (get_thread_pointer<mode>): New
expander.
gcc/testsuite/ChangeLog:
* gcc.target/i386/builtin_thread_pointer.c: New test.
Tobias Burnus [Wed, 9 Sep 2020 07:33:51 +0000 (09:33 +0200)]
Fortran: Fixes for OpenMP loop-iter privatization (PRs 95109 + 94690)
This commit also fixes a gfortran.dg/gomp/target1.f90 regression;
target1.f90 tests the resolve.c and openmp.c changes.
gcc/fortran/ChangeLog:
PR fortran/95109
PR fortran/94690
* resolve.c (gfc_resolve_code): Also call
gfc_resolve_omp_parallel_blocks for 'distribute parallel do (simd)'.
* openmp.c (gfc_resolve_omp_parallel_blocks): Handle it.
(gfc_resolve_do_iterator): Remove special code for SIMD, which is
not needed.
* trans-openmp.c (gfc_trans_omp_target): For TARGET_PARALLEL_DO_SIMD,
call the simd, not the do, processing function.
gcc/testsuite/ChangeLog:
PR fortran/95109
PR fortran/94690
* gfortran.dg/gomp/combined-if.f90: Update scan-tree-dump-times for
'omp simd.*if'.
* gfortran.dg/gomp/openmp-simd-5.f90: New test.
Ian Lance Taylor [Wed, 9 Sep 2020 02:21:54 +0000 (19:21 -0700)]
libbacktrace: don't strip leading underscore on 64-bit PE
* pecoff.c (coff_initialize_syminfo): Add is_64 parameter.
(coff_add): Determine and pass is_64.
Ian Lance Taylor [Wed, 9 Sep 2020 02:09:21 +0000 (19:09 -0700)]
libbacktrace: fetch executable path on macOS
PR libbacktrace/96973
* fileline.c (macho_get_executable_path): New static function.
(fileline_initialize): Call macho_get_executable_path.
Ian Lance Taylor [Wed, 9 Sep 2020 01:18:48 +0000 (18:18 -0700)]
libbacktrace: avoid ambiguous binary search
Searching for a range match can cause the search order to not match
the sort order, which can cause libbacktrace to miss matching entries.
Allocate an extra entry at the end of function_addrs and unit_addrs vectors,
so that we can safely compare to the next entry when searching.
Adjust the matching code accordingly.
Fixes https://github.com/ianlancetaylor/libbacktrace/issues/44.
* dwarf.c (function_addrs_search): Compare against the next entry
low address, not the high address.
(unit_addrs_search): Likewise.
(build_address_map): Add a trailing unit_addrs.
(read_function_entry): Add a trailing function_addrs.
(read_function_info): Likewise.
(report_inlined_functions): Search backward for function_addrs
match.
(dwarf_lookup_pc): Search backward for unit_addrs and
function_addrs matches.
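The search scheme described above can be sketched as follows. This is an
illustration rather than the libbacktrace code: struct addr_range, find_slot
and lookup_pc are made-up names, and the vector is assumed to be sorted by
low address with one trailing sentinel entry whose low address is
UINTPTR_MAX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct addr_range { uintptr_t low; uintptr_t high; };  /* [low, high) */

/* Binary search keyed only on low addresses, so the comparison order
   always agrees with the sort order.  v[count] must be the sentinel,
   which makes reading v[mid + 1] safe.  Returns the last index whose
   low address is <= pc, or -1.  Assumes pc < UINTPTR_MAX.  */
static ptrdiff_t
find_slot (const struct addr_range *v, size_t count, uintptr_t pc)
{
  size_t lo = 0, hi = count;
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (pc < v[mid].low)
        hi = mid;
      else if (pc >= v[mid + 1].low)
        lo = mid + 1;
      else
        return (ptrdiff_t) mid;
    }
  return -1;
}

/* Walk backward from the slot until a range actually contains pc.
   Every earlier entry already satisfies low <= pc.  */
static ptrdiff_t
lookup_pc (const struct addr_range *v, size_t count, uintptr_t pc)
{
  assert (v != NULL);
  for (ptrdiff_t i = find_slot (v, count, pc); i >= 0; i--)
    if (pc < v[i].high)
      return i;
  return -1;
}
```

With nested ranges such as [0x100,0x200) containing [0x120,0x130), a pc of
0x150 first lands on the inner entry by low address and the backward walk
then finds the enclosing one.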
GCC Administrator [Wed, 9 Sep 2020 00:16:29 +0000 (00:16 +0000)]
Daily bump.
Ian Lance Taylor [Tue, 8 Sep 2020 22:07:24 +0000 (15:07 -0700)]
libbacktrace: fix typo in comment
* simple.c (simple_unwind): Correct comment spelling.
Ian Lance Taylor [Tue, 8 Sep 2020 21:50:32 +0000 (14:50 -0700)]
libbacktrace: correct memory lengths in Mach-O dsym support
* macho.c (macho_add_dsym): Make space for '/' in dsym. Use
correct length when freeing diralc.
Julian Brown [Mon, 7 Sep 2020 18:43:16 +0000 (11:43 -0700)]
openacc: Fix atomic_capture-2.c iteration-ordering issues
The test case was written with assumptions about loop iteration ordering
that are not guaranteed by OpenACC and do not apply on all targets,
in particular AMD GCN. This patch removes those assumptions.
2020-09-08 Julian Brown <julian@codesourcery.com>
libgomp/
* testsuite/libgomp.oacc-c-c++-common/atomic_capture-2.c: Remove
iteration-ordering assumptions.
Julian Brown [Mon, 10 Feb 2020 20:26:57 +0000 (12:26 -0800)]
amdgcn: Add waitcnt after LDS write instructions
Data-share write (ds_write) instructions do not necessarily complete
the write to LDS immediately. When a write completes, LGKM_CNT is
decremented. For now, we wait until LGKM_CNT reaches zero after each
ds_write instruction.
This fixes a race condition in the case where LDS is read immediately
after being written. This can happen with broadcast operations.
2020-09-08 Julian Brown <julian@codesourcery.com>
gcc/
* config/gcn/gcn-valu.md (scatter<mode>_insn_1offset_ds<exec_scatter>):
Add waitcnt.
* config/gcn/gcn.md (*mov<mode>_insn, *movti_insn): Add waitcnt to
ds_write alternatives.
Julian Brown [Fri, 26 Jun 2020 16:07:58 +0000 (09:07 -0700)]
openacc: Fix mkoffload SGPR/VGPR count parsing for HSACO v3
If an offload kernel uses a large number of VGPRs, AMD GCN hardware may
need to limit the number of threads/workers launched for that kernel.
The number of SGPRs/VGPRs in use is detected by mkoffload and recorded in
the processed output. The patterns emitted detailing SGPR/VGPR occupancy
changed between HSACO v2 and v3 though, so this patch updates parsing
to account for that.
2020-09-08 Julian Brown <julian@codesourcery.com>
gcc/
* config/gcn/mkoffload.c (process_asm): Initialise regcount. Update
scanning for SGPR/VGPR usage for HSACO v3.
Julian Brown [Thu, 25 Jun 2020 14:40:53 +0000 (07:40 -0700)]
openacc: Fix race condition in Fortran loop collapse tests
The gangs participating in a gang-partitioned loop are not all guaranteed
to complete before some given gang continues to execute beyond that loop.
This means that two existing test cases contain a race condition,
because a loop that may be gang-partitioned is followed immediately by
another loop. The fix is to place the loops in separate parallel regions.
2020-09-08 Julian Brown <julian@codesourcery.com>
libgomp/
* testsuite/libgomp.oacc-fortran/collapse-1.f90: Fix race condition.
* testsuite/libgomp.oacc-fortran/collapse-2.f90: Likewise.
Ian Lance Taylor [Tue, 8 Sep 2020 20:16:44 +0000 (13:16 -0700)]
libbacktrace: correctly swap Mach-O 32-bit file offset
libbacktrace/ChangeLog:
PR libbacktrace/96973
* macho.c (macho_add_fat): Correctly swap 32-bit file offset.
Ian Lance Taylor [Tue, 8 Sep 2020 19:51:07 +0000 (12:51 -0700)]
libbacktrace: only match magic number at start of line
libbacktrace/ChangeLog:
PR libbacktrace/96971
* filetype.awk: Only match magic number at start of line.
Felix Willgerodt [Mon, 17 Aug 2020 11:36:49 +0000 (13:36 +0200)]
floatformat.h: Add bfloat16 support.
This change is motivated by a patchset that adds bfloat16 debugging
support for new avx512 instructions to GDB. The gdb thread can be found
here: https://sourceware.org/pipermail/gdb-patches/2020-July/170820.html
include:
2020-08-17 Felix Willgerodt <felix.willgerodt@intel.com>
* floatformat.h (floatformat_bfloat16_big): New.
(floatformat_bfloat16_little): New.
libiberty:
2020-08-17 Felix Willgerodt <felix.willgerodt@intel.com>
* floatformat.c (floatformat_bfloat16_big): New.
(floatformat_bfloat16_little): New.
David Malcolm [Mon, 7 Sep 2020 22:31:28 +0000 (18:31 -0400)]
analyzer: fix another ICE in constructor-handling [PR96949]
PR analyzer/96949 reports an ICE with
--param analyzer-max-svalue-depth=0, where the param value leads
to INTEGER_CST values in a RANGE_EXPR being treated as unknown
symbolic values.
This patch replaces implicit assumptions that these values are
concrete (and thus have concrete bit offsets), adding
error-handling for symbolic cases instead of assertions.
gcc/analyzer/ChangeLog:
PR analyzer/96949
* store.cc (binding_map::apply_ctor_val_to_range): Add
error-handling for the cases where we have symbolic offsets.
gcc/testsuite/ChangeLog:
PR analyzer/96949
* gfortran.dg/analyzer/pr96949.f90: New test.
David Malcolm [Mon, 7 Sep 2020 21:43:02 +0000 (17:43 -0400)]
analyzer: fix ICE on RANGE_EXPR with CONSTRUCTOR value [PR96950]
gcc/analyzer/ChangeLog:
PR analyzer/96950
* store.cc (binding_map::apply_ctor_to_region): Handle RANGE_EXPR
where min_index == max_index.
(binding_map::apply_ctor_val_to_range): Replace assertion that we
don't have a CONSTRUCTOR value with error-handling.
David Malcolm [Mon, 7 Sep 2020 21:16:37 +0000 (17:16 -0400)]
analyzer: fix ICE on machine-specific builtins [PR96962]
In g:ee7bfbe5eb70a23bbf3a2cedfdcbd2ea1a20c3f2 I added a
switch (DECL_UNCHECKED_FUNCTION_CODE (callee_fndecl))
to region_model::on_call_pre guarded by
fndecl_built_in_p (callee_fndecl).
I meant to handle only normal built-ins, whereas this
single-argument overload of fndecl_built_in_p returns true for any
kind of built-in.
PR analyzer/96962 reports a case where this matches for a
machine-specific builtin, leading to an ICE. Fixed thusly.
gcc/analyzer/ChangeLog:
PR analyzer/96962
* region-model.cc (region_model::on_call_pre): Fix guard on switch
on built-ins to only consider BUILT_IN_NORMAL, rather than other
kinds of built-ins.
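The guard problem can be illustrated with a toy model. The toy_* names are
hypothetical and merely imitate the one-argument and two-argument forms of
the predicate described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a function decl with a built-in class and a code.  */
enum toy_built_in_class
{
  TOY_NOT_BUILT_IN,
  TOY_BUILT_IN_FRONTEND,
  TOY_BUILT_IN_MD,
  TOY_BUILT_IN_NORMAL
};

struct toy_fndecl { enum toy_built_in_class klass; unsigned code; };

/* One-argument form: true for any kind of built-in.  */
static bool
toy_built_in_p (const struct toy_fndecl *decl)
{
  assert (decl != NULL);
  return decl->klass != TOY_NOT_BUILT_IN;
}

/* Two-argument form: true only for built-ins of the given class.  */
static bool
toy_built_in_class_p (const struct toy_fndecl *decl,
                      enum toy_built_in_class klass)
{
  assert (decl != NULL);
  return decl->klass == klass;
}
```

A machine-specific decl passes toy_built_in_p but not
toy_built_in_class_p (decl, TOY_BUILT_IN_NORMAL), so only the second guard
keeps it away from a switch over normal built-in codes.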
Aldy Hernandez [Tue, 8 Sep 2020 07:42:03 +0000 (07:42 +0000)]
PR tree-optimization/96967 - cast label range to type of switch operand
PR tree-optimization/96967
* tree-vrp.c (find_case_label_range): Cast label range to
type of switch operand.
Jozef Lawrynowicz [Tue, 8 Sep 2020 10:31:02 +0000 (11:31 +0100)]
MSP430: Fix detection of assembler support for .mspabi_attribute
The assembly code ".mspabi_attribute 4,1" uses the object attribute
mechanism to indicate that the 430 ISA is in use. However, the default
ISA is 430X, so GAS fails to assemble this since the ISA wasn't also set
to 430 on the command line.
gcc/ChangeLog:
* config/msp430/msp430.c (msp430_file_end): Fix jumbled
HAVE_AS_MSPABI_ATTRIBUTE and HAVE_AS_GNU_ATTRIBUTE checks.
* configure: Regenerate.
* configure.ac: Use ".mspabi_attribute 4,2" to check for assembler
support for this object attribute directive.
Iain Buclaw [Mon, 7 Sep 2020 13:43:04 +0000 (15:43 +0200)]
libphobos: libdruntime doesn't support shadow stack (PR95680)
Rather than implementing support within D runtime itself, use libc
getcontext/swapcontext functions if CET is enabled.
Removes whatever CET support was in the switchContext routine for x86
D runtime, along with setting version AsmExternal, so that the fallback
ucontext_t implementation is used, which is capable of doing shadow
stack handling.
libphobos/ChangeLog:
PR d/95680
* Makefile.in: Regenerate.
* configure: Regenerate.
* configure.ac (DCFG_ENABLE_CET): Substitute.
* libdruntime/Makefile.in: Regenerate.
* libdruntime/config/x86/switchcontext.S: Remove CET support code.
* libdruntime/core/thread.d: Import gcc.config. Don't set version
AsmExternal when GNU_Enable_CET is true.
* libdruntime/gcc/config.d.in (GNU_Enable_CET): Define.
* src/Makefile.in: Regenerate.
* testsuite/Makefile.in: Regenerate.
Jozef Lawrynowicz [Tue, 8 Sep 2020 09:10:17 +0000 (10:10 +0100)]
MSP430: Use enums to handle -mcpu= values
The -mcpu= option accepts only a handful of string values.
Using enums instead of strings to handle the accepted values removes the
need to have specific processing of the strings in the backend, and
simplifies any comparisons which need to be performed on the value.
It also allows the default value to have semantic equivalence to a user
set value, whilst retaining the ability to differentiate between them.
Practically, this allows a user set -mcpu= value to override the ISA set by
-mmcu, whilst the default -mcpu= value can still have an explicit meaning.
gcc/ChangeLog:
* common/config/msp430/msp430-common.c (msp430_handle_option): Remove
OPT_mcpu_ handling.
Set target_cpu value to new enum values when parsing certain -mmcu=
values.
* config/msp430/msp430-opts.h (enum msp430_cpu_types): New.
* config/msp430/msp430.c (msp430_option_override): Handle new
target_cpu enum values.
Set target_cpu using extracted value for given MCU when -mcpu=
option is not passed by the user.
* config/msp430/msp430.opt: Handle -mcpu= values using enums.
gcc/testsuite/ChangeLog:
* gcc.target/msp430/mcpu-is-430.c: New test.
* gcc.target/msp430/mcpu-is-430x.c: New test.
* gcc.target/msp430/mcpu-is-430xv2.c: New test.
Thomas Koenig [Tue, 8 Sep 2020 06:13:29 +0000 (08:13 +0200)]
Fix description of FINDLOC result.
gcc/fortran/ChangeLog:
* intrinsic.texi: Fix description of FINDLOC result.
Alan Modra [Wed, 2 Sep 2020 02:53:47 +0000 (12:23 +0930)]
ubsan: d-demangle.c:214 signed integer overflow
Running the libiberty testsuite
./test-demangle < libiberty/testsuite/d-demangle-expected
libiberty/d-demangle.c:214:14: runtime error: signed integer overflow:
922337203 * 10 cannot be represented in type 'long int'
On looking at silencing ubsan, I found a real bug in dlang_number.
For a 32-bit long, some overflows won't be detected. For example,
21474836480. Why? Well 214748364 * 10 is 0x7FFFFFF8 (no overflow so
far). Adding 8 gives 0x80000000 (which does overflow but there is no
test for that overflow in the code). Then multiplying 0x80000000 * 10
= 0x500000000 = 0 won't be caught by the multiplication overflow test.
The same holds for a 64-bit long using similarly crafted digit
sequences.
* d-demangle.c: Include limits.h.
(ULONG_MAX, UINT_MAX): Provide fall-back definition.
(dlang_number): Simplify and correct overflow test. Only
write *ret on returning non-NULL. Make "ret" an unsigned long*.
Only succeed for result of [0,UINT_MAX].
(dlang_decode_backref): Simplify and correct overflow test.
Only write *ret on returning non-NULL. Only succeed for
result [1,LONG_MAX].
(dlang_backref): Remove now unnecessary range check.
(dlang_symbol_name_p): Likewise.
(string_need): Take a size_t n arg, and use size_t tem.
(string_append): Use size_t n.
(string_appendn, string_prependn): Take a size_t n arg.
(TEMPLATE_LENGTH_UNKNOWN): Define as -1UL.
(dlang_lname, dlang_parse_template): Take an unsigned long len
arg.
(dlang_symbol_backref, dlang_identifier, dlang_parse_integer),
(dlang_parse_integer, dlang_parse_string),
(dlang_parse_arrayliteral, dlang_parse_assocarray),
(dlang_parse_structlit, dlang_parse_tuple),
(dlang_template_symbol_param, dlang_template_args): Use
unsigned long variables.
* testsuite/d-demangle-expected: Add new tests.
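The 32-bit arithmetic in the explanation above can be replayed with
fixed-width unsigned types, where wraparound is well defined. The two
functions below are a sketch written for this note, not the d-demangle.c
code: parse_wrapping_u32 shows the silent wrap, and parse_checked_u32 shows
one common way to test for overflow before it happens.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Accumulates decimal digits with no overflow test; the value simply
   wraps modulo 2^32.  */
static uint32_t
parse_wrapping_u32 (const char *s)
{
  uint32_t val = 0;
  while (isdigit ((unsigned char) *s))
    val = val * 10u + (uint32_t) (*s++ - '0');
  return val;
}

/* Checks before each step that val * 10 + digit still fits.  Writes
   *ret only on success and returns a pointer past the digits, or NULL
   on overflow or if there are no digits.  */
static const char *
parse_checked_u32 (const char *s, uint32_t *ret)
{
  assert (s != NULL && ret != NULL);
  if (!isdigit ((unsigned char) *s))
    return NULL;

  uint32_t val = 0;
  while (isdigit ((unsigned char) *s))
    {
      uint32_t digit = (uint32_t) (*s - '0');
      if (val > (UINT32_MAX - digit) / 10u)
        return NULL;
      val = val * 10u + digit;
      s++;
    }
  *ret = val;
  return s;
}
```

For the string "21474836480" the wrapping version returns 0, matching the
0x500000000 = 0 step in the text, while the checked version rejects it.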
GCC Administrator [Tue, 8 Sep 2020 00:16:32 +0000 (00:16 +0000)]
Daily bump.
Harald Anlauf [Mon, 7 Sep 2020 19:41:45 +0000 (21:41 +0200)]
PR fortran/96711 - ICE with NINT() for integer(16) result
When rounding a real to the nearest integer, temporarily convert the real
argument to a longer real kind when the result is of type/kind integer(16).
gcc/fortran/ChangeLog:
* trans-intrinsic.c (build_round_expr): Use temporary with
appropriate kind for conversion before rounding to nearest
integer when the result precision is 128 bits.
gcc/testsuite/ChangeLog:
* gfortran.dg/pr96711.f90: New test.
Richard Sandiford [Mon, 7 Sep 2020 19:15:36 +0000 (20:15 +0100)]
lra: Avoid cycling on certain subreg reloads [PR96796]
This PR is about LRA cycling for a reload of the form:
----------------------------------------------------------------------------
Changing pseudo 196 in operand 1 of insn 103 on equiv [r105:DI*0x8+r140:DI]
Creating newreg=287, assigning class ALL_REGS to slow/invalid mem r287
Creating newreg=288, assigning class ALL_REGS to slow/invalid mem r288
103: r203:SI=r288:SI<<0x1+r196:DI#0
REG_DEAD r196:DI
Inserting slow/invalid mem reload before:
316: r287:DI=[r105:DI*0x8+r140:DI]
317: r288:SI=r287:DI#0
----------------------------------------------------------------------------
The problem is with r287. We rightly give it a broad starting class of
POINTER_AND_FP_REGS (reduced from ALL_REGS by preferred_reload_class).
However, we never make forward progress towards narrowing it down to
a specific choice of class (POINTER_REGS or FP_REGS).
I think in practice we rely on two things to narrow a reload pseudo's
class down to a specific choice:
(1) a restricted class is specified when the pseudo is created
This happens for input address reloads, where the class is taken
from the target's chosen base register class. It also happens
for simple REG reloads, where the class is taken from the chosen
alternative's constraints.
(2) uses of the reload pseudo as a direct input operand
In this case get_reload_reg tries to reuse the existing register
and narrow its class, instead of creating a new reload pseudo.
However, neither occurs here. As described above, r287 rightly
starts out with a wide choice of class, ultimately derived from
ALL_REGS, so we don't get (1). And as the comments in the PR
explain, r287 is never used as an input reload, only the subreg is,
so we don't get (2):
----------------------------------------------------------------------------
Choosing alt 13 in insn 317: (0) r (1) w {*movsi_aarch64}
Creating newreg=291, assigning class FP_REGS to r291
317: r288:SI=r291:SI
Inserting insn reload before:
320: r291:SI=r287:DI#0
----------------------------------------------------------------------------
IMO, in this case we should rely on the reload of insn 316 to narrow
down the class of r287. Currently we do:
----------------------------------------------------------------------------
Choosing alt 7 in insn 316: (0) r (1) m {*movdi_aarch64}
Creating newreg=289 from oldreg=287, assigning class GENERAL_REGS to r289
316: r289:DI=[r105:DI*0x8+r140:DI]
Inserting insn reload after:
318: r287:DI=r289:DI
----------------------------------------------------------------------------
i.e. we create a new pseudo register r289 and give *that* pseudo
GENERAL_REGS instead. This is because get_reload_reg only narrows
down the existing class for OP_IN and OP_INOUT, not OP_OUT.
But if we have a reload pseudo in a reload instruction and have chosen
a specific class for the reload pseudo, I think we should simply install
it for OP_OUT reloads too, if the class is a subset of the existing class.
We will need to pick such a register whatever happens (for r289 in the
example above). And as explained in the PR, doing this actually avoids
an unnecessary move via the FP registers too.
The patch is quite aggressive in that it does this for all reload
pseudos in all reload instructions. I wondered about reusing the
condition for a reload move in in_class_p:
INSN_UID (curr_insn) >= new_insn_uid_start
&& curr_insn_set != NULL
&& ((OBJECT_P (SET_SRC (curr_insn_set))
&& ! CONSTANT_P (SET_SRC (curr_insn_set)))
|| (GET_CODE (SET_SRC (curr_insn_set)) == SUBREG
&& OBJECT_P (SUBREG_REG (SET_SRC (curr_insn_set)))
&& ! CONSTANT_P (SUBREG_REG (SET_SRC (curr_insn_set)))))))
but I can't really justify that on first principles. I think we
should apply the rule consistently until we have a specific reason
for doing otherwise.
gcc/
PR rtl-optimization/96796
* lra-constraints.c (in_class_p): Add a default-false
allow_all_reload_class_changes_p parameter. Do not treat
reload moves specially when the parameter is true.
(get_reload_reg): Try to narrow the class of an existing OP_OUT
reload if we're reloading a reload pseudo in a reload instruction.
gcc/testsuite/
PR rtl-optimization/96796
* gcc.c-torture/compile/pr96796.c: New test.
Jonathan Wakely [Mon, 7 Sep 2020 19:09:17 +0000 (20:09 +0100)]
libstdc++: Simplify chrono::duration::_S_gcd
We can simplify this constexpr function further because we know that
period::num >= 1 and period::den >= 1 so only the remainder can ever be
zero.
libstdc++-v3/ChangeLog:
* include/std/chrono (duration::_S_gcd): Use invariant that
neither value is zero initially.
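The simplification can be illustrated with a C rendition of Euclid's
algorithm. This is not the libstdc++ code (which is a constexpr member
function); gcd_nonzero is a hypothetical name and intmax_t is chosen only
for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Euclid's algorithm relying on the invariant m >= 1 and n >= 1.
   Because neither input is zero, the first division is always valid
   and only the remainder can become zero.  */
static intmax_t
gcd_nonzero (intmax_t m, intmax_t n)
{
  assert (m >= 1 && n >= 1);
  do
    {
      intmax_t rem = m % n;
      m = n;
      n = rem;
    }
  while (n != 0);
  return m;
}
```

No special case for a zero argument is needed, which is what removes a
branch compared with a general-purpose gcd.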
Jonathan Wakely [Mon, 7 Sep 2020 19:09:17 +0000 (20:09 +0100)]
libstdc++: Simplify constraints for semiregular-box [LWG 3477]
libstdc++-v3/ChangeLog:
* include/std/ranges (__box): Simplify constraints as per LWG 3477.
Andrea Corallo [Mon, 7 Sep 2020 12:45:47 +0000 (13:45 +0100)]
vec: Revert "dead code removal in tree-vect-loop.c" and add a comment.
gcc/ChangeLog
2020-09-07 Andrea Corallo <andrea.corallo@arm.com>
* tree-vect-loop.c (vect_estimate_min_profitable_iters): Revert
dead-code removal introduced by 09fa6acd8d9 + add a comment to
clarify.
Jozef Lawrynowicz [Mon, 7 Sep 2020 16:52:04 +0000 (17:52 +0100)]
doc: Update documentation on MODE_PARTIAL_INT subregs
In d8487c949ad5, MODE_PARTIAL_INT modes were changed from having an
unknown number of undefined bits, to having a known number of undefined
bits, however the documentation on using SUBREG expressions with
MODE_PARTIAL_INT modes was not updated to reflect this.
gcc/ChangeLog:
* doc/rtl.texi (subreg): Fix documentation to state there is a known
number of undefined bits in regs and subregs of MODE_PARTIAL_INT modes.
Jozef Lawrynowicz [Mon, 7 Sep 2020 16:35:04 +0000 (17:35 +0100)]
MSP430: Don't override default ISA when MCU name is unrecognized
430X is the default ISA under normal operation, so even when the MCU name
passed to -mmcu= is unrecognized, it should not be overridden.
gcc/ChangeLog:
* config/msp430/msp430.c (msp430_option_override): Don't set the
ISA to 430 when the MCU is unrecognized.
gcc/testsuite/ChangeLog:
* gcc.target/msp430/430x-default-isa.c: New test.
Iain Sandoe [Mon, 7 Sep 2020 08:23:16 +0000 (09:23 +0100)]
Darwin, testsuite : Update pubtypes tests.
Recent changes in debug output have resulted in a change
in the length of the pub types info. This updates the tests to
reflect the new length.
gcc/testsuite/ChangeLog:
* gcc.dg/pubtypes-2.c: Amend Pub Info Length.
* gcc.dg/pubtypes-3.c: Likewise.
* gcc.dg/pubtypes-4.c: Likewise.
Iain Sandoe [Mon, 7 Sep 2020 08:21:40 +0000 (09:21 +0100)]
Darwin : Update libc function availability.
Darwin libc has sincos from 10.9 (darwin13) onwards.
gcc/ChangeLog:
* config/darwin.c (darwin_libc_has_function): Report sincos
available from 10.9.
Alex Coplan [Mon, 7 Sep 2020 14:23:44 +0000 (15:23 +0100)]
aarch64: Remove redundant mult patterns
Following on from the previous commit to fix up the syntax for
add/sub/adds/subs and friends with a sign/zero-extended operand, this
patch removes the "mult" variants of these patterns which are all
redundant.
This patch removes the following patterns from the AArch64 backend:
*adds_mul_imm_<mode>
*subs_mul_imm_<mode>
*adds_<optab><mode>_multp2
*subs_<optab><mode>_multp2
*add_mul_imm_<mode>
*add_<optab><ALLX:mode>_mult_<GPI:mode>
*add_<optab><SHORT:mode>_mult_si_uxtw
*add_<optab><mode>_multp2
*add_<optab>si_multp2_uxtw
*add_uxt<mode>_multp2
*add_uxtsi_multp2_uxtw
*sub_mul_imm_<mode>
*sub_mul_imm_si_uxtw
*sub_<optab><mode>_multp2
*sub_<optab>si_multp2_uxtw
*sub_uxt<mode>_multp2
*sub_uxtsi_multp2_uxtw
*neg_mul_imm_<mode>2
*neg_mul_imm_si2_uxtw
Together with the following predicates which were used only by these
patterns:
aarch64_pwr_imm3
aarch64_pwr_2_si
aarch64_pwr_2_di
These patterns are all redundant since multiplications by powers of two
should be represented as shifts outside a (mem).
---
gcc/ChangeLog:
* config/aarch64/aarch64.md (*adds_mul_imm_<mode>): Delete.
(*subs_mul_imm_<mode>): Delete.
(*adds_<optab><mode>_multp2): Delete.
(*subs_<optab><mode>_multp2): Delete.
(*add_mul_imm_<mode>): Delete.
(*add_<optab><ALLX:mode>_mult_<GPI:mode>): Delete.
(*add_<optab><SHORT:mode>_mult_si_uxtw): Delete.
(*add_<optab><mode>_multp2): Delete.
(*add_<optab>si_multp2_uxtw): Delete.
(*add_uxt<mode>_multp2): Delete.
(*add_uxtsi_multp2_uxtw): Delete.
(*sub_mul_imm_<mode>): Delete.
(*sub_mul_imm_si_uxtw): Delete.
(*sub_<optab><mode>_multp2): Delete.
(*sub_<optab>si_multp2_uxtw): Delete.
(*sub_uxt<mode>_multp2): Delete.
(*sub_uxtsi_multp2_uxtw): Delete.
(*neg_mul_imm_<mode>2): Delete.
(*neg_mul_imm_si2_uxtw): Delete.
* config/aarch64/predicates.md (aarch64_pwr_imm3): Delete.
(aarch64_pwr_2_si): Delete.
(aarch64_pwr_2_di): Delete.
Alex Coplan [Mon, 7 Sep 2020 14:20:21 +0000 (15:20 +0100)]
aarch64: Don't emit invalid zero/sign-extend syntax
Given the following C function:
double *f(double *p, unsigned x)
{
return p + x;
}
prior to this patch, GCC at -O2 would generate:
f:
add x0, x0, x1, uxtw 3
ret
but this add instruction uses architecturally-invalid syntax: the width
of the third operand conflicts with the width of the extension
specifier. The third operand is only permitted to be an x register when
the extension specifier is (u|s)xtx.
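The address computation that the add performs can be modelled in C++. This is only a sketch: it assumes an 8-byte double (hence the shift by 3), and address_by_hand is a hypothetical helper, not anything in GCC. The 32-bit index is zero-extended (uxtw) before being scaled, which is why the operand is naturally a w register.

```cpp
#include <cassert>
#include <cstdint>

static_assert(sizeof(double) == 8, "sketch assumes 8-byte double");

// The function from the commit message.
double *f(double *p, unsigned x)
{
    return p + x;
}

// Hypothetical helper: the same address, computed the way
// "add x0, x0, w1, uxtw #3" does it.
std::uintptr_t address_by_hand(double *p, unsigned x)
{
    // uxtw: zero-extend the 32-bit index to 64 bits; #3: scale by 8.
    std::uint64_t scaled = static_cast<std::uint64_t>(x) << 3;
    return reinterpret_cast<std::uintptr_t>(p) + scaled;
}
```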
This instruction, and analogous insns for adds, sub, subs, and cmp, are
rejected by clang, but accepted by binutils. Assembling and
disassembling such an insn with binutils gives the architecturally-valid
version in the disassembly:
0:
8b214c00 add x0, x0, w1, uxtw #3
This patch fixes several patterns in the AArch64 backend to use the
standard syntax as specified in the Arm ARM such that GCC's output can
be assembled by assemblers other than GAS.
---
gcc/ChangeLog:
* config/aarch64/aarch64.md
(*adds_<optab><ALLX:mode>_<GPI:mode>): Ensure extended operand
agrees with width of extension specifier.
(*subs_<optab><ALLX:mode>_<GPI:mode>): Likewise.
(*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
(*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
(*add_<optab><ALLX:mode>_<GPI:mode>): Likewise.
(*add_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
(*add_uxt<mode>_shift2): Likewise.
(*sub_<optab><ALLX:mode>_<GPI:mode>): Likewise.
(*sub_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
(*sub_uxt<mode>_shift2): Likewise.
(*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>): Likewise.
(*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax.
* gcc.target/aarch64/cmp.c: Likewise.
* gcc.target/aarch64/subs3.c: Likewise.
* gcc.target/aarch64/subsp.c: Likewise.
* gcc.target/aarch64/extend-syntax.c: New test.
Richard Biener [Mon, 7 Sep 2020 12:26:46 +0000 (14:26 +0200)]
improve SLP vect dumping
This adds additional dumping helping in particular basic-block
vectorization SLP dump reading plus showing what we actually
generate code from.
2020-09-07 Richard Biener <rguenther@suse.de>
* tree-vect-slp.c (vect_analyze_slp_instance): Dump
stmts we start SLP analysis from, failure and splitting.
(vect_schedule_slp): Dump SLP graph entry and root stmt
we are about to emit code for.
Martin Storsjö [Mon, 7 Sep 2020 11:18:42 +0000 (13:18 +0200)]
gcc: Make strchr return value pointers const
This fixes compilation of codepaths for dos-like filesystems
with Clang. When built with clang, it treats C input files as C++
when the compiler driver is invoked in C++ mode, triggering errors
when the return value of strchr() on a pointer to const is assigned
to a pointer to non-const variable.
This matches similar variables outside of the ifdefs for dos-like
path handling.
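A minimal sketch of the kind of code affected, assuming a hypothetical helper after_drive that is not taken from dwarf2out.c or files.c. In C++, strchr applied to a pointer to const returns a pointer to const, so the result has to be stored in a const char *.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: skip a DOS-style drive prefix such as "c:".
const char *after_drive(const char *path)
{
    // "char *colon = ..." would be rejected here when compiled as C++.
    const char *colon = std::strchr(path, ':');
    return colon ? colon + 1 : path;
}
```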
2020-09-07 Martin Storsjö <martin@martin.st>
gcc/
* dwarf2out.c (file_name_acquire): Make a strchr return value
pointer to const.
libcpp/
* files.c (remap_filename): Make a strchr return value pointer
to const.
Tobias Burnus [Mon, 7 Sep 2020 10:29:05 +0000 (12:29 +0200)]
Fortran: Fixes for pointer function call as variable (PR96896)
gcc/fortran/ChangeLog:
PR fortran/96896
* resolve.c (get_temp_from_expr): Also reset proc_pointer +
use_assoc attribute.
(resolve_ptr_fcn_assign): Use information from the LHS.
gcc/testsuite/ChangeLog:
PR fortran/96896
* gfortran.dg/ptr_func_assign_4.f08: Update dg-error.
* gfortran.dg/ptr-func-3.f90: New test.
Tom de Vries [Mon, 7 Sep 2020 09:54:27 +0000 (11:54 +0200)]
[libatomic, testsuite] Add missing include in atomic-generic.c
When compiling atomic-generic.c from the libatomic testsuite, we run into:
...
$ gcc src/libatomic/testsuite/libatomic.c/atomic-generic.c -latomic
src/libatomic/testsuite/libatomic.c/atomic-generic.c: In function ‘main’:
src/libatomic/testsuite/libatomic.c/atomic-generic.c:31:7: warning: \
implicit declaration of function ‘memcmp’ [-Wimplicit-function-declaration]
if (memcmp (&a, &zero, size))
^~~~~~
...
Fix this by adding the missing string.h include.
Tested on x86_64.
libatomic/ChangeLog:
* testsuite/libatomic.c/atomic-generic.c: Include string.h.
liuhongt [Mon, 7 Sep 2020 08:29:24 +0000 (16:29 +0800)]
Adjust testcase.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/slp-46.c: Add --param vect-epilogues-nomask=0 to
avoid backend interference.
Jakub Jelinek [Mon, 7 Sep 2020 07:54:38 +0000 (09:54 +0200)]
lto: Stream edge goto_locus [PR94235]
The following patch adds streaming of edge goto_locus (both LOCATION_LOCUS
and LOCATION_BLOCK from it), the PR shows a testcase (inappropriate for
gcc testsuite) where the lack of streaming of goto_locus results in worse
debug info.
An earlier version of the patch (without the output_function changes) failed
miserably because of the order mismatch: input_function would
first input_cfg, then input_eh_regions and then input_bb (all of which now
have locations), while output_function used output_eh_regions, then output_bb
and then output_cfg. *_cfg went to a separate stream...
Now, is there a reason why the order is different?
If the intent is that the cfg could be read separately from the rest of
function or vice versa, alternatively we'd need to clear_line_info ();
before output_eh_regions and before/after output_cfg to make them
independent.
2020-09-07 Jakub Jelinek <jakub@redhat.com>
PR debug/94235
* lto-streamer-out.c (output_cfg): Also stream goto_locus for edges.
Use bp_pack_var_len_unsigned instead of streamer_write_uhwi to stream
e->dest->index and e->flags.
(output_function): Call output_cfg before output_ssa_name, rather than
after streaming all bbs.
* lto-streamer-in.c (input_cfg): Stream in goto_locus for edges.
Use bp_unpack_var_len_unsigned instead of streamer_read_uhwi to stream
in dest_index and edge_flags.
Richard Biener [Fri, 4 Sep 2020 13:33:19 +0000 (15:33 +0200)]
code generate live lanes in basic-block vectorization
The following adds the capability to code-generate live lanes in
basic-block vectorization using lane extracts from vector stmts
rather than keeping the original scalar code around for those.
This eventually makes previously unprofitable vectorizations
profitable (the live scalar code was appropriately costed, and so
are the lane extracts now). Setting the cost model aside,
this patch doesn't add or remove any basic-block vectorization
capabilities.
The patch re/ab-uses STMT_VINFO_LIVE_P in basic-block vectorization
mode to tell whether a live lane is vectorized or whether it is
provided by means of keeping the scalar code live.
The patch is a first step towards vectorizing sequences of
stmts that do not end up in stores or vector constructors though.
Bootstrapped and tested on x86_64-unknown-linux-gnu.
2020-09-04 Richard Biener <rguenther@suse.de>
* tree-vectorizer.h (vectorizable_live_operation): Adjust.
* tree-vect-loop.c (vectorizable_live_operation): Vectorize
live lanes out of basic-block vectorization nodes.
* tree-vect-slp.c (vect_bb_slp_mark_live_stmts): New function.
(vect_slp_analyze_operations): Analyze live lanes and their
vectorization possibility after the whole SLP graph is final.
(vect_bb_slp_scalar_cost): Adjust for vectorized live lanes.
* tree-vect-stmts.c (can_vectorize_live_stmts): Adjust.
(vect_transform_stmt): Call can_vectorize_live_stmts also for
basic-block vectorization.
* gcc.dg/vect/bb-slp-46.c: New testcase.
* gcc.dg/vect/bb-slp-47.c: Likewise.
* gcc.dg/vect/bb-slp-32.c: Adjust.
Francois-Xavier Coudert [Mon, 7 Sep 2020 07:38:25 +0000 (09:38 +0200)]
fortran: Fix argument types in derived types procedures
gcc/fortran/ChangeLog
* trans-types.c (gfc_get_derived_type): Fix argument types.
Francois-Xavier Coudert [Mon, 7 Sep 2020 07:36:29 +0000 (09:36 +0200)]
fortran: Fix arg types of _gfortran_is_extension_of
gcc/fortran/ChangeLog
* resolve.c (resolve_select_type): Provide a formal arg list.
liuhongt [Mon, 7 Sep 2020 07:23:39 +0000 (15:23 +0800)]
Adjust testcase.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr92658-avx512bw-trunc.c: Add
-mprefer-vector-width=512 to avoid impact of different default
tune which gcc is built with.
GCC Administrator [Mon, 7 Sep 2020 00:16:22 +0000 (00:16 +0000)]
Daily bump.
Francois-Xavier Coudert [Sun, 6 Sep 2020 16:36:20 +0000 (18:36 +0200)]
fortran: Add comment about previous commit
gcc/fortran/ChangeLog
* trans-types.c (gfc_get_ppc_type): Add comment.
Francois-Xavier Coudert [Sun, 6 Sep 2020 16:33:04 +0000 (18:33 +0200)]
fortran: Fix function arg types for class objects
gcc/fortran/ChangeLog
* trans-types.c (gfc_get_ppc_type): Fix function arg types.
Francois-Xavier Coudert [Sun, 6 Sep 2020 16:24:50 +0000 (18:24 +0200)]
fortran: caf_fail_image expects no argument
gcc/fortran/ChangeLog
PR fortran/96947
* trans-stmt.c (gfc_trans_fail_image): caf_fail_image
expects no argument.
gcc/testsuite/ChangeLog
* gfortran.dg/coarray_fail_st.f90: Adjust test.
GCC Administrator [Sun, 6 Sep 2020 00:16:20 +0000 (00:16 +0000)]
Daily bump.
GCC Administrator [Sat, 5 Sep 2020 00:16:20 +0000 (00:16 +0000)]
Daily bump.
Iain Buclaw [Fri, 4 Sep 2020 20:54:22 +0000 (22:54 +0200)]
d: Fix ICE in create_tmp_var, at gimple-expr.c:482
Array concatenate expressions were creating more SAVE_EXPRs than what
was necessary. The internal error itself was the result of a forced
temporary being made on a TREE_ADDRESSABLE type.
gcc/d/ChangeLog:
PR d/96924
* expr.cc (ExprVisitor::visit (CatAssignExp *)): Don't force
temporaries needlessly.
gcc/testsuite/ChangeLog:
PR d/96924
* gdc.dg/simd13927b.d: Removed.
* gdc.dg/pr96924.d: New test.
Jason Merrill [Wed, 2 Sep 2020 21:53:24 +0000 (17:53 -0400)]
c++: Use iloc_sentinel in mark_use.
gcc/cp/ChangeLog:
* expr.c (mark_use): Use iloc_sentinel.
Richard Biener [Fri, 4 Sep 2020 12:35:39 +0000 (14:35 +0200)]
tree-optimization/96920 - another ICE when vectorizing nested cycles
This refines the previous fix for PR96698 by re-doing how and where
we arrange for setting vectorized cycle PHI backedge values.
2020-09-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/96698
PR tree-optimization/96920
* tree-vectorizer.h (loop_vec_info::reduc_latch_defs): Remove.
(loop_vec_info::reduc_latch_slp_defs): Likewise.
* tree-vect-stmts.c (vect_transform_stmt): Remove vectorized
cycle PHI latch code.
* tree-vect-loop.c (maybe_set_vectorized_backedge_value): New
helper to set vectorized cycle PHI latch values.
(vect_transform_loop): Walk over all PHIs again after
vectorizing them, calling maybe_set_vectorized_backedge_value.
Call maybe_set_vectorized_backedge_value for each vectorized
stmt. Remove delayed update code.
* tree-vect-slp.c (vect_analyze_slp_instance): Initialize
SLP instance reduc_phis member.
(vect_schedule_slp): Set vectorized cycle PHI latch values.
* gfortran.dg/vect/pr96920.f90: New testcase.
* gcc.dg/vect/pr96920.c: Likewise.
Andrea Corallo [Fri, 4 Sep 2020 08:56:59 +0000 (09:56 +0100)]
vec: dead code removal in tree-vect-loop.c
gcc/ChangeLog
2020-09-04 Andrea Corallo <andrea.corallo@arm.com>
* tree-vect-loop.c (vect_estimate_min_profitable_iters): Remove
dead code as LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo) is
always verified.
Christophe Lyon [Fri, 4 Sep 2020 11:48:36 +0000 (11:48 +0000)]
arm: Improve immediate generation for thumb-1 with -mpurecode [PR96769]
This patch moves the move-immediate splitter after the regular ones so
that it has lower precedence, and updates its constraints.
For
int f3 (void) { return 0x11000000; }
int f3_2 (void) { return 0x12345678; }
we now generate:
* with -O2 -mcpu=cortex-m0 -mpure-code:
f3:
movs r0, #136
lsls r0, r0, #21
bx lr
f3_2:
movs r0, #18
lsls r0, r0, #8
adds r0, r0, #52
lsls r0, r0, #8
adds r0, r0, #86
lsls r0, r0, #8
adds r0, r0, #120
bx lr
* with -O2 -mcpu=cortex-m23 -mpure-code:
f3:
movs r0, #136
lsls r0, r0, #21
bx lr
f3_2:
movw r0, #22136
movt r0, 4660
bx lr
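The byte-by-byte sequence shown for f3_2 on cortex-m0 can be checked with a small model. This is a naive sketch, not the splitter in thumb1.md: Step, split_into_bytes and replay are hypothetical names, and the model always works in 8-bit chunks, whereas the real code also finds shorter forms such as the single shift used for f3.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One step: shift the accumulator left, then add an 8-bit immediate.
struct Step { unsigned shift; std::uint8_t add; };

// Split VALUE into a movs followed by (lsls #8, adds #imm) pairs.
std::vector<Step> split_into_bytes(std::uint32_t value)
{
    std::vector<Step> steps;
    bool started = false;
    for (int byte = 3; byte >= 0; --byte) {
        std::uint8_t chunk = (value >> (8 * byte)) & 0xff;
        if (!started) {
            if (chunk == 0 && byte != 0)
                continue;                 // skip leading zero bytes
            steps.push_back({0, chunk});  // movs r0, #chunk
            started = true;
        } else {
            steps.push_back({8, chunk});  // lsls r0, r0, #8; adds r0, r0, #chunk
        }
    }
    return steps;
}

// Re-run the steps to recover the constant they build.
std::uint32_t replay(const std::vector<Step> &steps)
{
    std::uint32_t acc = 0;
    for (const Step &s : steps)
        acc = (acc << s.shift) + s.add;
    return acc;
}
```

For 0x12345678 the chunks are 0x12, 0x34, 0x56 and 0x78 (18, 52, 86 and 120 in decimal), and f3 corresponds to 136 << 21 == 0x11000000.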
2020-09-04 Christophe Lyon <christophe.lyon@linaro.org>
PR target/96769
gcc/
* config/arm/thumb1.md: Move movsi splitter for
arm_disable_literal_pool after the other movsi splitters.
gcc/testsuite/
* gcc.target/arm/pure-code/pr96769.c: New test.
Aldy Hernandez [Fri, 4 Sep 2020 07:05:04 +0000 (09:05 +0200)]
rename widest_irange to int_range_max.
gcc/ChangeLog:
* range-op.cc (range_operator::fold_range): Rename widest_irange
to int_range_max.
(operator_div::wi_fold): Same.
(operator_lshift::op1_range): Same.
(operator_rshift::op1_range): Same.
(operator_cast::fold_range): Same.
(operator_cast::op1_range): Same.
(operator_bitwise_and::remove_impossible_ranges): Same.
(operator_bitwise_and::op1_range): Same.
(operator_abs::op1_range): Same.
(range_cast): Same.
(widest_irange_tests): Same.
(range3_tests): Rename irange3 to int_range3.
(int_range_max_tests): Rename from widest_irange_tests.
Rename widest_irange to int_range_max.
(operator_tests): Rename widest_irange to int_range_max.
(range_tests): Same.
* tree-vrp.c (find_case_label_range): Same.
* value-range.cc (irange::irange_intersect): Same.
(irange::invert): Same.
* value-range.h: Same.
Richard Biener [Fri, 4 Sep 2020 10:18:38 +0000 (12:18 +0200)]
tree-optimization/96931 - clear ctrl-altering flag more aggressively
The testcase shows that we fail to clear gimple_call_ctrl_altering_p
when the last abnormal edge goes away, causing an edge insert to
a loop header edge when we have preheaders to split the edge
unnecessarily.
The following addresses this by more aggressively clearing the
flag in cleanup_call_ctrl_altering_flag.
2020-09-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/96931
* tree-cfgcleanup.c (cleanup_call_ctrl_altering_flag): If
there's a fallthru edge and no abnormal edge the call is
no longer control-altering.
(cleanup_control_flow_bb): Pass down the BB to
cleanup_call_ctrl_altering_flag.
* gcc.dg/pr96931.c: New testcase.
Jakub Jelinek [Fri, 4 Sep 2020 09:55:13 +0000 (11:55 +0200)]
lto: Remove stream_input_location_now
As discussed yesterday, stream_input_location_now has been used in 3
remaining places. For ERT_MUST_NOT_THROW, I believe the failure_loc
location is stable at least until the apply_cache after the bbs are all
read, and the locations do not include BLOCK, so we can use normal
stream_input_location, and the two input_struct_function_base also
shouldn't include BLOCK and are stable at least until that same apply_cache
after reading all bbs, so again we can use the location cache.
2020-09-04 Jakub Jelinek <jakub@redhat.com>
* lto-streamer.h (stream_input_location_now): Remove declaration.
* lto-streamer-in.c (stream_input_location_now): Remove.
(input_eh_region, input_struct_function_base): Use
stream_input_location instead of stream_input_location_now.
Jakub Jelinek [Fri, 4 Sep 2020 09:53:28 +0000 (11:53 +0200)]
lto: Ensure we force a change for file/line/column after clear_line_info
As discussed yesterday:
On the streamer out side, we call clear_line_info
in multiple spots which resets the current_* values to something, but on the
reader side, we don't have corresponding resets in the same location, just have
the stream_* static variables that keep the current values through the
entire stream in (so across all the clear_line_info spots in a single LTO
object but also across jumping from one LTO object to another one).
Now, in an earlier version of my patch it actually broke LTO bootstrap
(and a lot of LTO testcases), so for the BLOCK case I've solved it by
clear_line_info setting current_block to something that should never appear,
which means that in the LTO stream after the clear_line_info spots including
the start of the LTO stream we force the block change bit to be set and thus
BLOCK to be streamed and therefore stream_block from earlier to be
ignored. But for the rest I think that is not the case, so I wonder if we
don't sometimes end up with wrong line/column info because of that, or
please tell me what prevents that.
clear_line_info does:
ob->current_file = NULL;
ob->current_line = 0;
ob->current_col = 0;
ob->current_sysp = false;
while I think NULL current_file is something that should likely be different
from expanded_location (...).file (UNKNOWN_LOCATION/BUILTINS_LOCATION are
handled separately and not go through the caching), I think line number 0
can sometimes occur and especially column 0 occurs frequently if we ran out
of location_t with columns info. But then we do:
bp_pack_value (bp, ob->current_file != xloc.file, 1);
bp_pack_value (bp, ob->current_line != xloc.line, 1);
bp_pack_value (bp, ob->current_col != xloc.column, 1);
and stream the details only if the != is true. If that happens immediately
after clear_line_info and e.g. xloc.column is 0, we would stream 0 bit and
not stream the actual value, so on read-in it would reuse whatever
stream_col etc. were before. Shouldn't we set some ob->current_* new bit
that would signal we are immediately past clear_line_info which would force
all these != checks to non-zero? Either by oring something into those
tests, or perhaps:
if (ob->current_reset)
{
if (xloc.file == NULL)
ob->current_file = "";
if (xloc.line == 0)
ob->current_line = 1;
if (xloc.column == 0)
ob->current_col = 1;
ob->current_reset = false;
}
before doing those bp_pack_value calls with a comment, effectively forcing
all 6 != comparisons to be true?
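The failure mode described above can be reproduced with a toy delta-encoded stream. This is a simplified model under stated assumptions, not the LTO streamer: Writer, Reader and column_seen_by_reader are hypothetical, only line and column are tracked, and the stream is a plain vector of ints holding a "changed" bit followed by the value when it changed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Writer {
public:
    explicit Writer(bool force_after_reset) : force_(force_after_reset) {}

    // Models clear_line_info: reset the writer's cached values.
    void clear_line_info() { line_ = 0; col_ = 0; reset_ = true; }

    void write(int line, int col, std::vector<int> &out)
    {
        if (reset_ && force_) {
            // Make the cached values differ so both fields get streamed.
            if (line == 0) line_ = 1;
            if (col == 0) col_ = 1;
        }
        reset_ = false;
        emit(line_, line, out);
        emit(col_, col, out);
    }

private:
    static void emit(int &cur, int value, std::vector<int> &out)
    {
        bool changed = cur != value;
        out.push_back(changed);
        if (changed) { out.push_back(value); cur = value; }
    }
    bool force_;
    int line_ = 0, col_ = 0;
    bool reset_ = false;
};

class Reader {
public:
    explicit Reader(const std::vector<int> &in) : in_(in) {}
    // The reader never resets: it keeps its cached values for the whole stream.
    void read(int &line, int &col)
    {
        take(line_); take(col_);
        line = line_; col = col_;
    }
private:
    void take(int &cur) { if (in_[pos_++]) cur = in_[pos_++]; }
    const std::vector<int> &in_;
    std::size_t pos_ = 0;
    int line_ = 0, col_ = 0;
};

// Stream (10,5), reset the writer, stream (20,0); return the column read back.
int column_seen_by_reader(bool force_after_reset)
{
    std::vector<int> stream;
    Writer w(force_after_reset);
    w.write(10, 5, stream);
    w.clear_line_info();
    w.write(20, 0, stream);
    Reader r(stream);
    int line = 0, col = 0;
    r.read(line, col);
    r.read(line, col);
    return col;
}
```

Without the forced change the reader reuses the stale column 5. With it, the column 0 is streamed explicitly and read back correctly.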
2020-09-04 Jakub Jelinek <jakub@redhat.com>
* lto-streamer.h (struct output_block): Add reset_locus member.
* lto-streamer-out.c (clear_line_info): Set reset_locus to true.
(lto_output_location_1): If reset_locus, clear it and ensure
current_{file,line,col} is different from xloc members.
David Faust [Fri, 4 Sep 2020 08:18:56 +0000 (10:18 +0200)]
bpf: generate indirect calls for xBPF
This patch updates the BPF back end to generate indirect calls via
the 'call %reg' instruction when targeting xBPF.
Additionally, the BPF ASM_SPEC is updated to pass along -mxbpf to
gas, where it is now supported.
2020-09-03 David Faust <david.faust@oracle.com>
gcc/
* config/bpf/bpf.h (ASM_SPEC): Pass -mxbpf to gas, if specified.
* config/bpf/bpf.c (bpf_output_call): Support indirect calls in xBPF.
gcc/testsuite/
* gcc.target/bpf/xbpf-indirect-call-1.c: New test.
Kewen Lin [Fri, 4 Sep 2020 02:58:39 +0000 (21:58 -0500)]
test/rs6000: Replace test targets p8 and p9+
This patch is to clean existing rs6000 test targets p8 and p9+
with existing has_arch_pwr8 and has_arch_pwr9 targets combination
or only one of them.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr92398.p9+.c: Replace p9+ with has_arch_pwr9.
* gcc.target/powerpc/pr92398.p9-.c: Replace p9+ with has_arch_pwr9,
and replace p8 with has_arch_pwr8 && !has_arch_pwr9.
* lib/target-supports.exp (check_effective_target_p8): Remove.
(check_effective_target_p9+): Remove.
GCC Administrator [Fri, 4 Sep 2020 00:16:32 +0000 (00:16 +0000)]
Daily bump.
Martin Jambor [Thu, 3 Sep 2020 20:43:49 +0000 (22:43 +0200)]
sra: Avoid SRAing if there is an out-of-bounds access (PR 96820)
The testcase causes an ICE in the SRA verifier on x86_64 when
compiling with -m32 because build_user_friendly_ref_for_offset looks
at an out-of-bounds array_ref within an array_ref which accesses an
offset which does not fit into a signed 32bit integer and turns it
into an array-ref with a negative index.
The best thing is probably to bail out early when encountering an out
of bounds access to a local stack-allocated aggregate (and let the DSE
just delete such statements) which is what the patch does.
I also glanced over to the initial candidate vetting routine to make
sure the size would fit into HWI and noticed that it uses unsigned
variants whereas the rest of SRA operates on signed offsets and
sizes (because get_ref_and_extent does) and so changed that for the
sake of consistency. These ancient checks operate on sizes of types
as opposed to DECLs but I hope that any issues potentially arising
from that are basically hypothetical.
gcc/ChangeLog:
2020-08-28 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/96820
* tree-sra.c (create_access): Disqualify candidates with accesses
beyond the end of the original aggregate.
(maybe_add_sra_candidate): Check that candidate type size fits
signed uhwi for the sake of consistency.
gcc/testsuite/ChangeLog:
2020-08-28 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/96820
* gcc.dg/tree-ssa/pr96820.c: New test.
Will Schmidt [Mon, 20 Jul 2020 15:51:37 +0000 (10:51 -0500)]
[PATCH, rs6000] Fix vector long long subtype (PR96139)
Hi,
This corrects an issue with the powerpc vector long long subtypes.
As reported by SjMunroe, when building some code with -Wall, and
attempting to print an element of a "long long vector" with a
long long printf format string, we will report an error because
the vector sub-type was improperly defined as int.
When defining a V2DI_type_node we use a TARGET_POWERPC64 ternary to
define the V2DI_type_node with "vector long" or "vector long long".
We also need to specify the proper sub-type when we define the type.
PR target/96139
2020-09-03 Will Schmidt <will_schmidt@vnet.ibm.com>
gcc/ChangeLog:
* config/rs6000/rs6000-call.c (rs6000_init_builtin): Update V2DI_type_node
and unsigned_V2DI_type_node definitions.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr96139-a.c: New test.
* gcc.target/powerpc/pr96139-b.c: New test.
* gcc.target/powerpc/pr96139-c.c: New test.
Jakub Jelinek [Thu, 3 Sep 2020 19:53:40 +0000 (21:53 +0200)]
c++: Fix another PCH hash_map issue [PR96901]
The recent libstdc++ changes caused lots of libstdc++-v3 tests FAILs
on i686-linux, all of them in the same spot during constexpr evaluation
of a recursive _S_gcd call.
The problem is yet another hash_map that used the default hashing of
tree keys through pointer hashing which is preserved across PCH write/read.
During PCH handling, the addresses of GC objects are changed, which means
that the hash values of the keys in such hash tables change without those
hash tables being rehashed. Which in the fundef_copies_table case usually
means we just don't find a copy of a FUNCTION_DECL body for recursive uses
and start from scratch. But when the hash table keeps growing, the "dead"
elements in the hash table can sometimes reappear and break things.
In particular what I saw under the debugger is when the fundef_copies_table
hash map has been used on the outer _S_gcd call, it didn't find an entry for
it, so returned a slot with *slot == NULL, which is treated as that the
function itself is used directly (i.e. no recursion), but that addition of
a hash table slot caused the recursive _S_gcd call to actually find
something in the hash table, unfortunately not the new *slot == NULL spot,
but a different one from the pre-PCH streaming which contained the returned
toplevel (non-recursive) call entry for it, which means that for the
recursive _S_gcd call we actually used the same trees as for the outer ones
rather than a copy of those, which breaks constexpr evaluation.
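The difference between the two hashing strategies can be sketched in isolation. This is an illustration only: Decl and its uid field are hypothetical stand-ins for a declaration with a stable per-declaration id, and the sketch assumes, without the commit message saying so, that the new traits hash on such an id rather than on the address.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical stand-in for a declaration node with a stable id.
struct Decl { unsigned uid; };

// Pointer hashing: the value depends on where the object lives.
std::size_t hash_by_address(const Decl *d)
{
    return std::hash<const Decl *>{}(d);
}

// Id hashing: the value depends only on the object's contents.
std::size_t hash_by_uid(const Decl *d)
{
    return std::hash<unsigned>{}(d->uid);
}

// Model a PCH write/read cycle as "same declaration, new address".
bool uid_hash_survives_relocation()
{
    Decl before{42};
    Decl after = before;
    return hash_by_uid(&before) == hash_by_uid(&after);
}
```

A table keyed with hash_by_address would need rehashing after such a relocation. One keyed with hash_by_uid would not.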
2020-09-03 Jakub Jelinek <jakub@redhat.com>
PR c++/96901
* tree.h (struct decl_tree_traits): New type.
(decl_tree_map): New typedef.
* constexpr.c (fundef_copies_table): Change type from
hash_map<tree, tree> * to decl_tree_map *.
Harald Anlauf [Thu, 3 Sep 2020 18:33:14 +0000 (20:33 +0200)]
PR fortran/96890 - Wrong answer with intrinsic IALL
The IALL intrinsic would always return 0 when the DIM and MASK arguments
were present since the initial value of repeated BIT-AND operations was
set to 0 instead of -1.
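The reason the initial value matters is that 0 absorbs under bitwise AND while -1 (all bits set) is its identity. A minimal C++ model of a masked AND reduction, with the hypothetical name masked_iall and an init parameter added purely to show both behaviours:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// AND together the elements of VALUES whose MASK entry is true.
// INIT should be -1 (all bits set), the identity for bitwise AND.
std::int32_t masked_iall(const std::vector<std::int32_t> &values,
                         const std::vector<bool> &mask,
                         std::int32_t init = -1)
{
    std::int32_t result = init;
    for (std::size_t i = 0; i < values.size(); ++i)
        if (mask[i])
            result &= values[i];
    return result;
}
```

Starting from 0 yields 0 for any input, which matches the wrong answer reported in the PR.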
libgfortran/ChangeLog:
* m4/iall.m4: Initial value for result should be -1.
* generated/iall_i1.c (miall_i1): Generated.
* generated/iall_i16.c (miall_i16): Likewise.
* generated/iall_i2.c (miall_i2): Likewise.
* generated/iall_i4.c (miall_i4): Likewise.
* generated/iall_i8.c (miall_i8): Likewise.
gcc/testsuite/ChangeLog:
* gfortran.dg/iall_masked.f90: New test.
Marek Polacek [Wed, 26 Aug 2020 12:27:33 +0000 (08:27 -0400)]
c++: Fix P0960 in member init list and array [PR92812]
This patch nails down the remaining P0960 case in PR92812:
struct A {
int ar[2];
A(): ar(1, 2) {} // doesn't work without this patch
};
Note that when the target object is not of array type, this already
works:
struct S { int x, y; };
struct A {
S s;
A(): s(1, 2) { } // OK in C++20
};
because build_new_method_call_1 takes care of the P0960 magic.
It proved to be quite hairy. When the ()-list has more than one
element, we can always create a CONSTRUCTOR, because the code was
previously invalid. But when the ()-list has just one element, it
gets all kinds of difficult. As usual, we have to handle a("foo")
so as not to wrap the STRING_CST in a CONSTRUCTOR. Always turning
x(e) into x{e} would run into trouble as in c++/93790. Another
issue was what to do about x({e}): previously, this would trigger
"list-initializer for non-class type must not be parenthesized".
I figured I'd make this work in C++20, so that given
struct S { int x, y; };
you can do
S a[2];
[...]
A(): a({1, 2}) // initialize a[0] with {1, 2} and a[1] with {}
It also turned out that, as an extension, we support compound literals:
F (): m((S[1]) { 1, 2 })
so this has to keep working as before. Moreover, make sure not to trigger
in compiler-generated code, like =default, where array assignment is allowed.
I've factored out a function that turns a TREE_LIST into a CONSTRUCTOR
to simplify handling of P0960.
paren-init35.C also tests this with vector types.
gcc/cp/ChangeLog:
PR c++/92812
* cp-tree.h (do_aggregate_paren_init): Declare.
* decl.c (do_aggregate_paren_init): New.
(grok_reference_init): Use it.
(check_initializer): Likewise.
* init.c (perform_member_init): Handle initializing an array from
a ()-list. Use do_aggregate_paren_init.
gcc/testsuite/ChangeLog:
PR c++/92812
* g++.dg/cpp0x/constexpr-array23.C: Adjust dg-error.
* g++.dg/cpp0x/initlist69.C: Likewise.
* g++.dg/diagnostic/mem-init1.C: Likewise.
* g++.dg/init/array28.C: Likewise.
* g++.dg/cpp2a/paren-init33.C: New test.
* g++.dg/cpp2a/paren-init34.C: New test.
* g++.dg/cpp2a/paren-init35.C: New test.
* g++.old-deja/g++.brendan/crash60.C: Adjust dg-error.
* g++.old-deja/g++.law/init10.C: Likewise.
* g++.old-deja/g++.other/array3.C: Likewise.
Jakub Jelinek [Thu, 3 Sep 2020 18:11:43 +0000 (20:11 +0200)]
c++: Disable -frounding-math during manifestly constant evaluation [PR96862]
As discussed in the PR, fold-const.c punts on floating point constant
evaluation if the result is inexact and -frounding-math is turned on.
/* Don't constant fold this floating point operation if the
result may dependent upon the run-time rounding mode and
flag_rounding_math is set, or if GCC's software emulation
is unable to accurately represent the result. */
if ((flag_rounding_math
|| (MODE_COMPOSITE_P (mode) && !flag_unsafe_math_optimizations))
&& (inexact || !real_identical (&result, &value)))
return NULL_TREE;
Jonathan said that we should be evaluating them anyway, e.g. conceptually
as if they are done with the default rounding mode before user had a chance
to change that, and e.g. in C in initializers it is also ignored.
In fact, fold-const.c for C initializers turns off various other options:
/* Perform constant folding and related simplification of initializer
expression EXPR. These behave identically to "fold_buildN" but ignore
potential run-time traps and exceptions that fold must preserve. */
int saved_signaling_nans = flag_signaling_nans;\
int saved_trapping_math = flag_trapping_math;\
int saved_rounding_math = flag_rounding_math;\
int saved_trapv = flag_trapv;\
int saved_folding_initializer = folding_initializer;\
flag_signaling_nans = 0;\
flag_trapping_math = 0;\
flag_rounding_math = 0;\
flag_trapv = 0;\
folding_initializer = 1;
flag_signaling_nans = saved_signaling_nans;\
flag_trapping_math = saved_trapping_math;\
flag_rounding_math = saved_rounding_math;\
flag_trapv = saved_trapv;\
folding_initializer = saved_folding_initializer;
So, shall cxx_eval_outermost_constant_expr instead turn off all those
options? Then warning_sentinel wouldn't be the right thing to use, but given
the 8 or so return stmts in cxx_eval_outermost_constant_expr, we'd
need a RAII class for this. Not sure about the folding_initializer; that
one is affecting complex multiplication and division constant evaluation
somehow.
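A RAII guard of the kind mentioned above might look roughly like this. It is a sketch under assumptions: temp_setting, flag_rounding_math_model and evaluate_with_default_rounding are hypothetical names used for illustration, and nothing here is the class the patch actually uses.

```cpp
#include <cassert>

// Set a variable for the lifetime of the guard and restore it afterwards,
// whichever return statement leaves the scope.
template <typename T>
class temp_setting {
public:
    temp_setting(T &var, T value) : var_(var), saved_(var) { var_ = value; }
    ~temp_setting() { var_ = saved_; }
    temp_setting(const temp_setting &) = delete;
    temp_setting &operator=(const temp_setting &) = delete;
private:
    T &var_;
    T saved_;
};

// Hypothetical stand-in for the real option variable.
int flag_rounding_math_model = 1;

// Returns the flag value seen while the guard is active.
int evaluate_with_default_rounding(bool early_exit)
{
    temp_setting<int> guard(flag_rounding_math_model, 0);
    if (early_exit)
        return flag_rounding_math_model;  // restored on this path too
    return flag_rounding_math_model;
}
```

The point of the destructor is that a function with many return statements needs no restore code on each path.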
2020-09-03 Jakub Jelinek <jakub@redhat.com>
PR c++/96862
* constexpr.c (cxx_eval_outermost_constant_expr): Temporarily disable
flag_rounding_math during manifestly constant evaluation.
* g++.dg/cpp1z/constexpr-96862.C: New test.
Jonathan Wakely [Thu, 3 Sep 2020 15:26:16 +0000 (16:26 +0100)]
libstdc++: Add workaround for weird std::tuple error [PR 96592]
This "fix" makes no sense, but it avoids an error from G++ about
std::is_constructible being incomplete. The real problem is elsewhere,
but this "fixes" the regression for now.
libstdc++-v3/ChangeLog:
PR libstdc++/96592
* include/std/tuple (_TupleConstraints<true, T...>): Use
alternative is_constructible instead of std::is_constructible.
* testsuite/20_util/tuple/cons/96592.cc: New test.
Jonathan Wakely [Thu, 3 Sep 2020 11:38:50 +0000 (12:38 +0100)]
libstdc++: Optimise GCD algorithms
The current std::gcd and std::chrono::duration::_S_gcd algorithms are
both recursive. This is potentially expensive to evaluate in constant
expressions, because each level of recursion makes a new copy of the
function to evaluate. The maximum number of steps is bounded
(proportional to the number of decimal digits in the smaller value) and
so unlikely to exceed the limit for constexpr nesting, but the memory
usage is still suboptimal. By using an iterative algorithm we avoid
that compile-time cost. Because looping in constexpr functions is not
allowed until C++14, we need to keep the recursive implementation in
duration::_S_gcd for C++11 mode.
For std::gcd we can also optimise runtime performance by using the
binary GCD algorithm.
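Both iterative forms can be sketched as C++14 constexpr functions. These are illustrative versions on unsigned long long with the hypothetical names gcd_euclid and gcd_binary, not the libstdc++ implementations.

```cpp
#include <cassert>

// Iterative Euclidean algorithm; a loop in a constexpr function needs C++14.
constexpr unsigned long long gcd_euclid(unsigned long long m,
                                        unsigned long long n)
{
    while (n != 0) {
        unsigned long long r = m % n;
        m = n;
        n = r;
    }
    return m;
}

// Binary GCD: only shifts, comparisons and subtraction, no division.
constexpr unsigned long long gcd_binary(unsigned long long m,
                                        unsigned long long n)
{
    if (m == 0) return n;
    if (n == 0) return m;
    unsigned shift = 0;
    // Factor out the powers of two common to both values.
    while (((m | n) & 1) == 0) { m >>= 1; n >>= 1; ++shift; }
    // Make m odd; it stays odd from here on.
    while ((m & 1) == 0) m >>= 1;
    while (n != 0) {
        while ((n & 1) == 0) n >>= 1;  // drop factors of two from n
        if (m > n) { unsigned long long t = m; m = n; n = t; }
        n -= m;                        // odd - odd is even (or zero)
    }
    return m << shift;
}

static_assert(gcd_euclid(12, 18) == 6, "evaluated at compile time");
static_assert(gcd_binary(12, 18) == 6, "evaluated at compile time");
```

Neither function recurses, so constant evaluation does not need a fresh copy of the function body per step.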
libstdc++-v3/ChangeLog:
* include/std/chrono (duration::_S_gcd): Use iterative algorithm
for C++14 and later.
* include/std/numeric (__detail::__gcd): Replace recursive
Euclidean algorithm with iterative version of binary GCD algorithm.
* testsuite/26_numerics/gcd/1.cc: Test additional inputs.
* testsuite/26_numerics/gcd/gcd_neg.cc: Adjust dg-error lines.
* testsuite/26_numerics/lcm/lcm_neg.cc: Likewise.
* testsuite/experimental/numeric/gcd.cc: Test additional inputs.
* testsuite/26_numerics/gcd/2.cc: New test.
Jakub Jelinek [Thu, 3 Sep 2020 10:51:01 +0000 (12:51 +0200)]
lto: Cache location_ts including BLOCKs in GIMPLE streaming [PR94311]
As mentioned in the PR, when compiling valgrind, even on a fairly small
testcase where in one larger function the location keeps oscillating
between a small line number and an 8000-ish line number in the same file,
we very quickly run out of all possible location_t numbers and because of
that emit nonsensical line numbers in .debug_line.
There are ways to slow down the depletion of location_t numbers
in libcpp, but the main reason for this is that we use
stream_input_location_now for streaming in location_t for gimple_location
and phi arg locations. libcpp strongly prefers that the locations
it is given are sorted by the different files and by line numbers in
ascending order, otherwise it depletes quickly no matter what and is much
more costly (many extra file changes etc.).
The reason for not caching those were the BLOCKs that were streamed
immediately after the location and encoded into the locations (and for PHIs
we failed to stream the BLOCKs altogether).
This patch enhances the location cache to handle also BLOCKs (but not for
everything, only for the spots we care about the BLOCKs) and also optimizes
the size of the LTO stream by emitting a single bit into a pack whether the
BLOCK changed from last case and only streaming the BLOCK tree if it
changed.
2020-09-03 Jakub Jelinek <jakub@redhat.com>
PR lto/94311
* gimple.h (gimple_location_ptr, gimple_phi_arg_location_ptr): New
functions.
* streamer-hooks.h (struct streamer_hooks): Add
output_location_and_block callback. Fix up formatting for
output_location.
(stream_output_location_and_block): Define.
* lto-streamer.h (class lto_location_cache): Fix comment typo. Add
current_block member.
(lto_location_cache::input_location_and_block): New method.
(lto_location_cache::lto_location_cache): Initialize current_block.
(lto_location_cache::cached_location): Add block member.
(struct output_block): Add current_block member.
(lto_output_location): Formatting fix.
(lto_output_location_and_block): Declare.
* lto-streamer.c (lto_streamer_hooks_init): Initialize
streamer_hooks.output_location_and_block.
* lto-streamer-in.c (lto_location_cache::cmp_loc): Also compare
block members.
(lto_location_cache::apply_location_cache): Handle blocks.
(lto_location_cache::accept_location_cache,
lto_location_cache::revert_location_cache): Fix up function comments.
(lto_location_cache::input_location_and_block): New method.
(lto_location_cache::input_location): Implement using
input_location_and_block.
(input_function): Invoke apply_location_cache after streaming in all
bbs.
* lto-streamer-out.c (clear_line_info): Set current_block.
(lto_output_location_1): New function, moved from lto_output_location,
added block handling.
(lto_output_location): Implement using lto_output_location_1.
(lto_output_location_and_block): New function.
* gimple-streamer-in.c (input_phi): Use input_location_and_block
to input and cache both location and block.
(input_gimple_stmt): Likewise.
* gimple-streamer-out.c (output_phi): Use
stream_output_location_and_block.
(output_gimple_stmt): Likewise.
Richard Biener [Thu, 3 Sep 2020 10:44:40 +0000 (12:44 +0200)]
Improve constant folding of vector lowering with vector bools
This improves the situation somewhat when vector lowering tries
to access vector bools as seen in PR96814.
2020-09-03 Richard Biener <rguenther@suse.de>
* tree-vect-generic.c (tree_vec_extract): Remove odd
special-casing of boolean vectors.
* fold-const.c (fold_ternary_loc): Handle boolean vector
type BIT_FIELD_REFs.
Arnaud Charlet [Thu, 3 Sep 2020 08:34:48 +0000 (04:34 -0400)]
Preliminary work on support for 128-bit integers
* fe.h, opt.ads (Enable_128bit_Types): New.
* stand.ads (Standard_Long_Long_Long_Integer,
S_Long_Long_Long_Integer): New.
Arnaud Charlet [Thu, 3 Sep 2020 07:38:40 +0000 (03:38 -0400)]
Look at fullest view when checking for static types in unnesting
When seeing if any bound involved in a type is an uplevel reference,
we must look at the fullest view of a type, since that's what the
backends will do. Similarly for private types. We introduce
Get_Fullest_View for that purpose.
* sem_util.ads, sem_util.adb (Get_Fullest_View): New procedure.
* exp_unst.adb (Check_Static_Type): Do all processing on fullest
view of specified type.
liuhongt [Wed, 8 Jul 2020 09:14:36 +0000 (17:14 +0800)]
Optimize memory broadcast for constant vector under AVX512.
For a constant vector with a single duplicated value, there's no need to put
the whole vector in the constant pool; use an embedded broadcast instead.
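The condition being exploited can be sketched in plain C++. This is a
simplified model, not the pass itself: is_uniform, broadcast_scalar and
pool_bytes are hypothetical names, and lanes are compared by value here,
which is an assumption of the sketch rather than something the commit
states.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// True when every lane holds the same value, so one scalar in memory plus
// a broadcast could stand in for the full constant.
template <typename T, std::size_t N>
bool is_uniform(const std::array<T, N> &v) {
  for (std::size_t i = 1; i < N; ++i)
    if (v[i] != v[0])
      return false;
  return true;
}

// The scalar a broadcast would replicate; only valid for uniform input.
template <typename T, std::size_t N>
T broadcast_scalar(const std::array<T, N> &v) {
  static_assert(N > 0, "need at least one lane");
  assert(is_uniform(v));
  return v[0];
}

// Bytes of constant data needed under each strategy.
template <typename T, std::size_t N>
std::size_t pool_bytes(const std::array<T, N> &v) {
  return is_uniform(v) ? sizeof(T) : sizeof(T) * N;
}
```

For sixteen identical floats this model needs sizeof(float) bytes of
constant data instead of sixteen times that.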
2020-07-09 Hongtao Liu <hongtao.liu@intel.com>
gcc/ChangeLog:
PR target/87767
* config/i386/i386-features.c
(replace_constant_pool_with_broadcast): New function.
(constant_pool_broadcast): Ditto.
(class pass_constant_pool_broadcast): New pass.
(make_pass_constant_pool_broadcast): Ditto.
(remove_partial_avx_dependency): Call
replace_constant_pool_with_broadcast under TARGET_AVX512F; this
saves compile time when both the rpad and cpb passes are
available.
(remove_partial_avx_dependency_gate): New function.
(class pass_remove_partial_avx_dependency::gate): Call
remove_partial_avx_dependency_gate.
* config/i386/i386-passes.def: Insert new pass after combine.
* config/i386/i386-protos.h
(make_pass_constant_pool_broadcast): Declare.
* config/i386/sse.md (*avx512dq_mul<mode>3<mask_name>_bcst):
New define_insn.
(*avx512f_mul<mode>3<mask_name>_bcst): Ditto.
* config/i386/avx512fintrin.h (_mm512_set1_ps,
_mm512_set1_pd, _mm512_set1_epi32, _mm512_set1_epi64): Adjusted.
gcc/testsuite/ChangeLog:
PR target/87767
* gcc.target/i386/avx2-broadcast-pr87767-1.c: New test.
* gcc.target/i386/avx512f-broadcast-pr87767-1.c: New test.
* gcc.target/i386/avx512f-broadcast-pr87767-2.c: New test.
* gcc.target/i386/avx512f-broadcast-pr87767-3.c: New test.
* gcc.target/i386/avx512f-broadcast-pr87767-4.c: New test.
* gcc.target/i386/avx512f-broadcast-pr87767-5.c: New test.
* gcc.target/i386/avx512f-broadcast-pr87767-6.c: New test.
* gcc.target/i386/avx512f-broadcast-pr87767-7.c: New test.
* gcc.target/i386/avx512vl-broadcast-pr87767-1.c: New test.
* gcc.target/i386/avx512vl-broadcast-pr87767-2.c: New test.
* gcc.target/i386/avx512vl-broadcast-pr87767-3.c: New test.
* gcc.target/i386/avx512vl-broadcast-pr87767-4.c: New test.
* gcc.target/i386/avx512vl-broadcast-pr87767-5.c: New test.
* gcc.target/i386/avx512vl-broadcast-pr87767-6.c: New test.
liuhongt [Mon, 31 Aug 2020 02:54:13 +0000 (10:54 +0800)]
Adjust testcase.
gcc/testsuite/ChangeLog:
PR target/96246
PR target/96855
PR target/96856
PR target/96857
* g++.target/i386/avx512bw-pr96246-2.C: Add runtime check for
AVX512BW.
* g++.target/i386/avx512vl-pr96246-2.C: Add runtime check for
AVX512BW and AVX512VL.
* g++.target/i386/avx512f-helper.h: New header.
* gcc.target/i386/pr92658-avx512f.c: Add
-mprefer-vector-width=512 to avoid the impact of whichever default
mtune gcc is built with.
* gcc.target/i386/avx512bw-pr95488-1.c: Ditto.
* gcc.target/i386/pr92645-4.c: Add -mno-avx512f to avoid the
impact of whichever default march gcc is built with.
GCC Administrator [Thu, 3 Sep 2020 00:16:26 +0000 (00:16 +0000)]
Daily bump.
Iain Buclaw [Mon, 31 Aug 2020 20:42:10 +0000 (22:42 +0200)]
d: __vectors unsupported in hardware should be rejected at compile-time.
gcc/d/ChangeLog:
PR d/96869
* d-builtins.cc (build_frontend_type): Don't expose intrinsics that
use unsupported vector types.
* d-target.cc (Target::isVectorTypeSupported): Restrict to supporting
only if TARGET_VECTOR_MODE_SUPPORTED_P is true. Don't allow complex
or boolean vector types.
gcc/testsuite/ChangeLog:
PR d/96869
* gdc.dg/simd.d: Removed.
* gdc.dg/cast1.d: New test.
* gdc.dg/gdc213.d: Compile with target vect_sizes_16B_8B.
* gdc.dg/gdc284.d: Likewise.
* gdc.dg/gdc67.d: Likewise.
* gdc.dg/pr96869.d: New test.
* gdc.dg/simd1.d: New test.
* gdc.dg/simd10447.d: New test.
* gdc.dg/simd12776.d: New test.
* gdc.dg/simd13841.d: New test.
* gdc.dg/simd13927.d: New test.
* gdc.dg/simd15123.d: New test.
* gdc.dg/simd15144.d: New test.
* gdc.dg/simd16087.d: New test.
* gdc.dg/simd16697.d: New test.
* gdc.dg/simd17237.d: New test.
* gdc.dg/simd17695.d: New test.
* gdc.dg/simd17720a.d: New test.
* gdc.dg/simd17720b.d: New test.
* gdc.dg/simd19224.d: New test.
* gdc.dg/simd19627.d: New test.
* gdc.dg/simd19628.d: New test.
* gdc.dg/simd19629.d: New test.
* gdc.dg/simd19630.d: New test.
* gdc.dg/simd2a.d: New test.
* gdc.dg/simd2b.d: New test.
* gdc.dg/simd2c.d: New test.
* gdc.dg/simd2d.d: New test.
* gdc.dg/simd2e.d: New test.
* gdc.dg/simd2f.d: New test.
* gdc.dg/simd2g.d: New test.
* gdc.dg/simd2h.d: New test.
* gdc.dg/simd2i.d: New test.
* gdc.dg/simd2j.d: New test.
* gdc.dg/simd7951.d: New test.
* gdc.dg/torture/array2.d: New test.
* gdc.dg/torture/array3.d: New test.
* gdc.dg/torture/simd16488a.d: New test.
* gdc.dg/torture/simd16488b.d: New test.
* gdc.dg/torture/simd16703.d: New test.
* gdc.dg/torture/simd19223.d: New test.
* gdc.dg/torture/simd19607.d: New test.
* gdc.dg/torture/simd3.d: New test.
* gdc.dg/torture/simd4.d: New test.
* gdc.dg/torture/simd7411.d: New test.
* gdc.dg/torture/simd7413a.d: New test.
* gdc.dg/torture/simd7413b.d: New test.
* gdc.dg/torture/simd7414.d: New test.
* gdc.dg/torture/simd9200.d: New test.
* gdc.dg/torture/simd9304.d: New test.
* gdc.dg/torture/simd9449.d: New test.
* gdc.dg/torture/simd9910.d: New test.
Iain Buclaw [Mon, 31 Aug 2020 17:27:15 +0000 (19:27 +0200)]
d: Only test with default permutation flags for runnable tests.
Unless a test explicitly requests them, all compilable tests as well as
fail_compilation tests will be run without any extra flags.
The C++ tests are now checked against the shared D runtime library.
gcc/testsuite/ChangeLog:
* lib/gdc-utils.exp (gdc-convert-test): Handle LINK directive.
Set PERMUTE_ARGS as DEFAULT_DFLAGS only for runnable tests.
(gdc-do-test): Set default action of compilable tests to compile.
Test SHARED_OPTION on runnable_cxx tests.
Iain Buclaw [Mon, 31 Aug 2020 16:23:12 +0000 (18:23 +0200)]
d: Move all runnable tests in gdc.dg to gdc.dg/torture
Tests that are not executed do not need to be compiled as torture tests;
they are only present to test for a certain bug or ICE.
gcc/testsuite/ChangeLog:
* gdc.dg/dg.exp: Remove torture options.
* gdc.dg/gdc115.d: Move test to gdc.dg/torture.
* gdc.dg/gdc131.d: Likewise.
* gdc.dg/gdc141.d: Likewise.
* gdc.dg/gdc17.d: Likewise.
* gdc.dg/gdc171.d: Likewise.
* gdc.dg/gdc179.d: Likewise.
* gdc.dg/gdc186.d: Likewise.
* gdc.dg/gdc187.d: Likewise.
* gdc.dg/gdc191.d: Likewise.
* gdc.dg/gdc198.d: Likewise.
* gdc.dg/gdc200.d: Likewise.
* gdc.dg/gdc210.d: Likewise.
* gdc.dg/gdc240.d: Likewise.
* gdc.dg/gdc242b.d: Likewise.
* gdc.dg/gdc248.d: Likewise.
* gdc.dg/gdc250.d: Likewise.
* gdc.dg/gdc273.d: Likewise.
* gdc.dg/gdc283.d: Likewise.
* gdc.dg/gdc285.d: Likewise.
* gdc.dg/gdc286.d: Likewise.
* gdc.dg/gdc309.d: Likewise.
* gdc.dg/gdc35.d: Likewise.
* gdc.dg/gdc36.d: Likewise.
* gdc.dg/gdc51.d: Likewise.
* gdc.dg/gdc57.d: Likewise.
* gdc.dg/gdc66.d: Likewise.
* gdc.dg/imports/gdc36.d: Likewise.
* gdc.dg/init1.d: Likewise.
* gdc.dg/pr92309.d: Likewise.
* gdc.dg/pr94424.d: Likewise.
* gdc.dg/pr94777b.d: Likewise.
* gdc.dg/pr96152.d: Likewise.
* gdc.dg/pr96153.d: Likewise.
* gdc.dg/pr96156.d: Likewise.
* gdc.dg/pr96157a.d: Likewise.
* gdc.dg/torture/torture.exp: New file.
Jonathan Wakely [Wed, 2 Sep 2020 17:51:28 +0000 (18:51 +0100)]
c++: Stop defining true, false and bool as macros in <stdbool.h>
Since r216679 these macros have only been defined in C++98 mode, rather
than all modes. That is permitted as a GNU extension because that header
doesn't exist in the C++ standard until C++11, so we can make it do
whatever we want for C++98. But as discussed in the PR c++/60304
comments, these macros shouldn't ever be defined for C++.
This patch removes the macro definitions for C++98 too.
The new test already passed for C++98 (and the conversion is ill-formed
in C++11 and later) so this new test is arguably unnecessary.
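A small way to observe the effect from user code is to record at
preprocessing time whether the header defined the three names as macros.
This is only a sketch; the *_is_macro constants and stdbool_macro_count are
hypothetical names, and this is not the new testcase.

```cpp
#include <stdbool.h>

// Record whether <stdbool.h> turned each keyword into a macro.
#ifdef bool
const int bool_is_macro = 1;
#else
const int bool_is_macro = 0;
#endif

#ifdef true
const int true_is_macro = 1;
#else
const int true_is_macro = 0;
#endif

#ifdef false
const int false_is_macro = 1;
#else
const int false_is_macro = 0;
#endif

// Number of the three names that are macros after including the header.
inline int stdbool_macro_count() {
  return bool_is_macro + true_is_macro + false_is_macro;
}
```

With this patch applied the count should be zero in every C++ mode; before
it, the count would have been non-zero only for C++98.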
gcc/ChangeLog:
PR c++/60304
* ginclude/stdbool.h (bool, false, true): Never define for C++.
gcc/testsuite/ChangeLog:
PR c++/60304
* g++.dg/warn/Wconversion-null-5.C: New test.
Jonathan Wakely [Wed, 2 Sep 2020 17:37:17 +0000 (18:37 +0100)]
testsuite: Add missing <exception> header to testcase
This test no longer compiles because <new> stopped including
<exception>, so std::set_terminate is not defined.
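The fix pattern is simply to include the header that declares what is used
instead of relying on a transitive include. A minimal sketch follows;
quiet_terminate and install_quiet_terminate are hypothetical names and not
taken from the testcase.

```cpp
#include <cstdlib>
#include <exception>  // declares std::set_terminate and std::terminate_handler

// Hypothetical handler used only for illustration.
inline void quiet_terminate() { std::abort(); }

// Install the handler and return the previously installed one.
inline std::terminate_handler install_quiet_terminate() {
  return std::set_terminate(quiet_terminate);
}
```

Code that includes only <new> and calls std::set_terminate may stop
compiling once <new> no longer pulls in <exception>, which is what happened
to this test.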
gcc/testsuite/ChangeLog:
* g++.old-deja/g++.abi/cxa_vec.C: Include <exception> for
std::set_terminate.
Jonathan Wakely [Wed, 2 Sep 2020 16:20:37 +0000 (17:20 +0100)]
libstdc++: Fix test to use correct function
This was copied from a test for std::lcm but I forgot to change one of
the calls to use the experimental version of the function.
libstdc++-v3/ChangeLog:
PR libstdc++/92978
* testsuite/experimental/numeric/92978.cc: Use experimental::lcm
not std::lcm.