gdb: improve reuse of value contents when fetching array elements
While working on a Python script that was interacting with a remote
target, I noticed some unexpected slowness in GDB.  In my program I had
a structure something like this:
  struct foo_t
  {
    int array[5];
  };

  struct foo_t global_foo;
Then, in the Python script, I was fetching a complete copy of
global_foo, like this:
  val = gdb.parse_and_eval('global_foo')
  val.fetch_lazy()
Then I would work with items in foo_t.array, like:
  print(val['array'][1])
I called the fetch_lazy method specifically because I knew I was going
to end up accessing almost all of the contents of val, and so I wanted
GDB to make a single remote protocol call to fetch all the contents in
one go, rather than performing a series of lazy fetches a few bytes at
a time.
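
As a rough sketch of the intent, the value's is_lazy flag should flip
once fetch_lazy has pulled everything across in one read (is_lazy and
fetch_lazy are the standard gdb.Value API; global_foo is just my
example variable):

  val = gdb.parse_and_eval('global_foo')
  print(val.is_lazy)   # True: nothing read from the target yet.
  val.fetch_lazy()     # One remote read covering all of global_foo.
  print(val.is_lazy)   # False: the whole struct is now buffered in GDB.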
What I observed was that, after the fetch_lazy call, GDB does
correctly fetch the entire contents of global_foo, including all of
the contents of array.  However, when I then accessed val['array'][1],
GDB still went to the remote target and fetched the value of that
element again.
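
To see the extra fetch concretely, one rough check (before this
change) is to turn on remote packet logging and look at the element
that comes back from the subscript:

  gdb.execute('set debug remote 1')   # Log remote protocol packets.
  elt = val['array'][1]
  print(elt.is_lazy)   # True here, even though val is fully fetched...
  print(elt)           # ...so printing triggers another memory read.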
What's going on is that in valarith.c, in value_subscript, for C-like
languages we always end up treating the array value as a pointer,
calling value_ptradd and then value_ind; the second of these calls
always returns a lazy value.
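
The same pointer-flavoured path can be spelled out by hand through the
Python API, which makes the lazy result easy to see (an illustrative
sketch only, not literally what value_subscript does):

  arr = val['array']
  # Take the address of the first element, add the index, then
  # dereference: this mirrors the value_ptradd + value_ind path, and
  # the dereferenced value comes back lazy.
  elt = (arr[0].address + 1).dereference()
  print(elt.is_lazy)   # True: contents not yet read from the target.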
My guess is that this pointer-based approach allows us to handle
indexing off the end of an array, for example when working with
zero-element arrays, or when indexing a raw pointer as an array.  And
I agree that in those cases, where even a non-lazy original value will
not have the contents of the element loaded, we should be using the
value_ind approach.
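
For example, indexing a raw pointer as an array has to read from the
target no matter what has already been fetched; something like this
(some_ptr is a made-up 'int *' variable) can only work via the
pointer-plus-dereference route:

  ptr = gdb.parse_and_eval('some_ptr')   # Made-up 'int *' in the inferior.
  elem = ptr[3]        # No array bounds to consult; GDB has to read
  print(elem)          #   *(some_ptr + 3) from target memory.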
However, for cases where we do have the array contents loaded, and we
do know the bounds of the array, I think we should be using
value_subscripted_rvalue, which is what we already use for
non-C-like languages.
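
After the change, the observable difference at the Python level should
be that an element taken from an already-fetched array comes back
non-lazy (a sketch of the expectation, reusing the earlier example):

  val = gdb.parse_and_eval('global_foo')
  val.fetch_lazy()
  elt = val['array'][1]
  print(elt.is_lazy)   # Expect False: the element reuses the contents
                       # already held in val, no new target access.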
One problem I did run into, exposed by gdb.base/charset.exp, was that
value_subscripted_rvalue stripped typedefs from the element type of
the array, which meant the value returned did not have the same type
as an element of the array, but instead had the raw, non-typedefed
type.  In charset.exp we got back an 'int' instead of a
'wchar_t' (which is a typedef of 'int'), and this impacts how we print
the value.  Removing typedefs from the resulting value just seems
wrong, so I got rid of that, and I don't see any test regressions.
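
The symptom is easy to see from Python; assuming a hypothetical
'wchar_t wide_array[5]' in the program, checking the element's type
shows whether the typedef survived:

  w = gdb.parse_and_eval('wide_array')   # Hypothetical 'wchar_t wide_array[5]'.
  print(w[1].type)   # With the typedef stripped this shows 'int';
                     # keeping the typedef it shows 'wchar_t', which
                     # also changes how the element is printed.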
With this change in place, my original Python script no longer
performs any additional memory accesses, and its performance improves
by about 10x!