As observed on a binary compiled on AMD64 Ubuntu 20.04, against glibc
2.31 (I think it's the libc that provides this startup code, right?),
there are enough bytes at the executable's entry point to hold more than
one displaced step buffer. gdbarch_max_insn_length is 16, and the
code at _start looks like:
0000000000001040 <_start>:
1040: f3 0f 1e fa endbr64
1044: 31 ed xor %ebp,%ebp
1046: 49 89 d1 mov %rdx,%r9
1049: 5e pop %rsi
104a: 48 89 e2 mov %rsp,%rdx
104d: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
1051: 50 push %rax
1052: 54 push %rsp
1053: 4c 8d 05 56 01 00 00 lea 0x156(%rip),%r8 # 11b0 <__libc_csu_fini>
105a: 48 8d 0d df 00 00 00 lea 0xdf(%rip),%rcx # 1140 <__libc_csu_init>
1061: 48 8d 3d c1 00 00 00 lea 0xc1(%rip),%rdi # 1129 <main>
1068: ff 15 72 2f 00 00 callq *0x2f72(%rip) # 3fe0 <__libc_start_main@GLIBC_2.2.5>
106e: f4 hlt
106f: 90 nop
The two buffers would occupy [0x1040, 0x1060).
I checked on Alpine, which uses the musl C library, the startup code
looks like:
0000000000001048 <_start>:
1048: 48 31 ed xor %rbp,%rbp
104b: 48 89 e7 mov %rsp,%rdi
104e: 48 8d 35 e3 2d 00 00 lea 0x2de3(%rip),%rsi # 3e38 <_DYNAMIC>
1055: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
1059: e8 00 00 00 00 callq 105e <_start_c>
000000000000105e <_start_c>:
105e: 48 8b 37 mov (%rdi),%rsi
1061: 48 8d 57 08 lea 0x8(%rdi),%rdx
1065: 45 31 c9 xor %r9d,%r9d
1068: 4c 8d 05 47 01 00 00 lea 0x147(%rip),%r8 # 11b6 <_fini>
106f: 48 8d 0d 8a ff ff ff lea -0x76(%rip),%rcx # 1000 <_init>
1076: 48 8d 3d 0c 01 00 00 lea 0x10c(%rip),%rdi # 1189 <main>
107d: e9 9e ff ff ff jmpq 1020 <__libc_start_main@plt>
Even though there's a _start_c symbol, it all appears to be code that
runs once at the very beginning of the program, so it looks fine if the
two buffers occupy [0x1048, 0x1068).
One important thing I discovered while doing this is that when debugging
a dynamically-linked executable, breakpoints in the shared library
loader are hit before executing the _start code, and these breakpoints
may be displaced-stepped. So it's very important that the buffer bytes
are restored properly after doing the displaced steps, otherwise the
_start code will be corrupted once we try to execute it.
Another thing that made me think about is that library constructors (as
in `__attribute__((constructor))`) run before _start. And they are free
to spawn threads. What if one of these threads executes a displaced
step, therefore changing the bytes at _start, while the main thread
executes _start? That doesn't sound good and I don't know how we could
prevent it. But this is a problem that predates the current patch.
Even when stress-testing the implementation, by making many threads do
displaced steps over and over, I didn't see a significant performance (I
confirmed that the two buffers were used by checking the "set debug
displaced" logs though). However, this patch mostly helps make the
feature testable by anybody with an AMD64/Linux machine, so I think it's
useful.
gdb/ChangeLog:
* amd64-linux-tdep.c (amd64_linux_init_abi): Pass 2 as the
number of displaced step buffers.
Change-Id: Ia0c96ea0fcda893f4726df6fdac7be5214620112