IBM Z: Try to make use of load-and-test instructions
This patch enables a peephole2 optimization which transforms a load of
constant zero into a temporary register which is then finally used to
compare against a floating-point register of interest into a single load
and test instruction. However, the optimization is only applied if both
registers are dead afterwards and if we test for (in)equality only.
This is relaxed in case of fast math.
This is a follow up to PR88856.
gcc/ChangeLog:
* config/s390/s390.md ("*cmp<mode>_ccs_0", "*cmp<mode>_ccz_0",
"*cmp<mode>_ccs_0_fastmath"): Basically change "*cmp<mode>_ccs_0" into
"*cmp<mode>_ccz_0" and for fast math add "*cmp<mode>_ccs_0_fastmath".
gcc/testsuite/ChangeLog:
* gcc.target/s390/load-and-test-fp-1.c: Change test to include all
possible combinations of dead/live registers and comparisons (equality,
relational).
* gcc.target/s390/load-and-test-fp-2.c: Same as load-and-test-fp-1.c
but for fast math.
* gcc.target/s390/load-and-test-fp.h: New test included by
load-and-test-fp-{1,2}.c.