Arm: Fix performance issue with thumb-2 tailcalls
We currently use a padding NOP after a Thumb to Arm interworking veneer (BX pc).
The NOP is never executed but may result in a performance penalty on some cores.
For this reason this patch changes the NOPs after Thumb to Arm veneers into B .-2
and adds a note to this in the source code for future reference.
bfd/ChangeLog:
* elf32-arm.c (elf32_thumb2_plt_entry, elf32_arm_plt_thumb_stub,
elf32_arm_stub_long_branch_v4t_thumb_thumb,
elf32_arm_stub_long_branch_v4t_thumb_arm,
elf32_arm_stub_short_branch_v4t_thumb_arm,
elf32_arm_stub_long_branch_v4t_thumb_arm_pic,
elf32_arm_stub_long_branch_v4t_thumb_thumb_pic,
elf32_arm_stub_long_branch_v4t_thumb_tls_pic): Change nop to branch to
previous instruction.
ld/ChangeLog:
* testsuite/ld-arm/cortex-a8-fix-b-plt.d: Update Testcase.
* testsuite/ld-arm/cortex-a8-fix-b-rel-arm.d: Likewise.
* testsuite/ld-arm/cortex-a8-fix-bcc-plt.d: Likewise.
* testsuite/ld-arm/farcall-cond-thumb-arm.d: Likewise.
* testsuite/ld-arm/farcall-mixed-app.d: Likewise.
* testsuite/ld-arm/farcall-mixed-app2.d: Likewise.
* testsuite/ld-arm/farcall-mixed-lib-v4t.d: Likewise.
* testsuite/ld-arm/farcall-thumb-arm-pic-veneer.d: Likewise.
* testsuite/ld-arm/farcall-thumb-arm-short.d: Likewise.
* testsuite/ld-arm/farcall-thumb-arm.d: Likewise.
* testsuite/ld-arm/farcall-thumb-thumb-pic-veneer.d: Likewise.
* testsuite/ld-arm/farcall-thumb-thumb.d: Likewise.
* testsuite/ld-arm/fix-arm1176-on.d: Likewise.
* testsuite/ld-arm/ifunc-10.dd: Likewise.
* testsuite/ld-arm/ifunc-2.dd: Likewise.
* testsuite/ld-arm/ifunc-4.dd: Likewise.
* testsuite/ld-arm/ifunc-6.dd: Likewise.
* testsuite/ld-arm/ifunc-8.dd: Likewise.
* testsuite/ld-arm/jump-reloc-veneers-long.d: Likewise.
* testsuite/ld-arm/mixed-app.d: Likewise.
* testsuite/ld-arm/thumb2-b-interwork.d: Likewise.
* testsuite/ld-arm/tls-longplt.d: Likewise.
* testsuite/ld-arm/tls-thumb1.d: Likewise.
26 files changed: