Fix PR c++/21323: GDB thinks char16_t and char32_t are signed in C++
author Pedro Alves <palves@redhat.com>
Wed, 12 Apr 2017 13:00:49 +0000 (14:00 +0100)
committer Pedro Alves <palves@redhat.com>
Wed, 12 Apr 2017 13:00:49 +0000 (14:00 +0100)
commit 53e710acd249e1861029b19b7a3d8195e7f28929
tree 38684d014c6f2c67e1dca970494d82e6e1788107
parent 5e0e0422137063ff3846886c8eeb64e98e7669d6

While the C++ standard says that char16_t and char32_t are unsigned types:

 Types char16_t and char32_t denote distinct types with the same size,
 signedness, and alignment as uint_least16_t and uint_least32_t,
 respectively, in <cstdint>, called the underlying types.

... gdb treats them as signed currently:

 (gdb) p (char16_t)-1
 $1 = -1 u'\xffff'
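
The signedness requirement quoted above can be checked outside gdb with a
few compile-time assertions.  This is a minimal sketch, assuming a C++11
compiler; the helper name minus_one_as_char16 is made up for illustration
and is not part of gdb:

```cpp
#include <cstdint>
#include <type_traits>

// The standard ties char16_t/char32_t to uint_least16_t/uint_least32_t,
// so both must be unsigned and the same size as those underlying types.
static_assert(std::is_unsigned<char16_t>::value, "char16_t must be unsigned");
static_assert(std::is_unsigned<char32_t>::value, "char32_t must be unsigned");
static_assert(sizeof(char16_t) == sizeof(std::uint_least16_t),
              "char16_t has the size of uint_least16_t");
static_assert(sizeof(char32_t) == sizeof(std::uint_least32_t),
              "char32_t has the size of uint_least32_t");

// Hypothetical helper: the numeric value of (char16_t)-1 as the compiler
// sees it.  Because the type is unsigned, the result wraps to the maximum
// value of the type and is never negative.
constexpr unsigned long minus_one_as_char16()
{
  return static_cast<char16_t>(-1);
}
```

On targets where char16_t is exactly 16 bits wide, minus_one_as_char16()
is 65535, which is the number a debugger should print for the expression
above instead of -1.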

There are actually two places in gdb that hardcode these types:

- gdbtypes.c:gdbtypes_post_init, when creating the built-in types,
  seemingly used by the "x /s" command (judging from commit 9a22f0d0).

- dwarf2read.c, when reading base types with DW_ATE_UTF encoding
  (which is what is used for these types, when compiling for C++11 and
  up).  Despite the comment, the type created does end up used.

Both places need fixing.  But since I couldn't tell why dwarf2read.c
needs to create a new type, I've made it use the per-arch built-in
types instead, so that the types are only created once per arch
instead of once per objfile.  That seems to work fine.
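
The selection logic described above can be modelled roughly as follows.
This is a standalone sketch, not the code in dwarf2read.c: TypeModel,
ArchBuiltins and pick_utf_type are hypothetical stand-ins for gdb's type
objects, the per-arch built-in types, and the DW_ATE_UTF case in
read_base_type.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a gdb type object.
struct TypeModel {
  std::string name;
  int bits;
  bool is_unsigned;
  bool shared_per_arch;  // true when this is the arch-wide built-in type
};

// Hypothetical stand-in for the per-architecture built-in types.
struct ArchBuiltins {
  TypeModel char16{"char16_t", 16, true, true};
  TypeModel char32{"char32_t", 32, true, true};
};

// Model of the DW_ATE_UTF handling: reuse the built-in type for 16 and
// 32 bits; otherwise create a fresh unsigned integer type and record
// that a complaint was issued.
TypeModel pick_utf_type(const ArchBuiltins &arch, int bits,
                        const std::string &name, bool *complained)
{
  assert(complained != nullptr);
  *complained = false;
  if (bits == 16)
    return arch.char16;
  if (bits == 32)
    return arch.char32;
  *complained = true;
  return TypeModel{name, bits, true, false};
}
```

The point of the design is that the common 16-bit and 32-bit cases no
longer allocate a type per objfile, while unusual bit sizes still get a
usable (now unsigned) type.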

While writing the test, I noticed that the C++ language parser isn't
actually aware of these built-in types, so if you try to use them
without a program that uses them, you get:

 (gdb) set language c++
 (gdb) ptype char16_t
 No symbol table is loaded.  Use the "file" command.
 (gdb) ptype u"hello"
 No type named char16_t.
 (gdb) p u"hello"
 No type named char16_t.

That's fixed by simply adding a couple entries to C++'s built-in types
array in c-lang.c.  With that, we get the expected:

 (gdb) ptype char16_t
 type = char16_t
 (gdb) ptype u"hello"
 type = char16_t [6]
 (gdb) p u"hello"
 $1 = u"hello"
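
The idea of registering the two types in a language's primitive type
table can be illustrated with a small model.  This is only a sketch under
assumed names: prim_index, PrimType, primitive_types and
lookup_primitive_type are invented for the example and do not match the
actual declarations in c-lang.c.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical enum indexing a language's primitive types.  The two
// char16_t/char32_t values play the role of the new enum values.
enum prim_index {
  prim_int,
  prim_unsigned_int,
  prim_char16_t,
  prim_char32_t,
  nr_prim_types
};

// Hypothetical description of a primitive type.
struct PrimType {
  const char *name;
  bool is_unsigned;
};

// Table the parser consults when no symbol table is loaded.
static const PrimType primitive_types[nr_prim_types] = {
  {"int", false},
  {"unsigned int", true},
  {"char16_t", true},
  {"char32_t", true},
};

// Look a type up by name; returns nullptr when the language does not
// know it, which corresponds to the "No type named ..." error.
const PrimType *lookup_primitive_type(const char *name)
{
  assert(name != nullptr);
  for (int i = 0; i < nr_prim_types; ++i)
    if (std::strcmp(primitive_types[i].name, name) == 0)
      return &primitive_types[i];
  return nullptr;
}
```

Without the two table entries the lookup for "char16_t" fails, which is
the situation shown in the first session above.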

gdb/ChangeLog:
2017-04-12  Pedro Alves  <palves@redhat.com>

PR c++/21323
* c-lang.c (cplus_primitive_types) <cplus_primitive_type_char16_t,
cplus_primitive_type_char32_t>: New enum values.
(cplus_language_arch_info): Register cplus_primitive_type_char16_t
and cplus_primitive_type_char32_t.
* dwarf2read.c (read_base_type) <DW_ATE_UTF>: If bit size is 16 or
32, use the architecture's built-in type for char16_t and char32_t,
respectively.  Otherwise, fall back to init_integer_type as before,
but make the type unsigned, and issue a complaint.
* gdbtypes.c (gdbtypes_post_init): Make char16_t and char32_t unsigned.

gdb/testsuite/ChangeLog:
2017-04-12  Pedro Alves  <palves@redhat.com>

PR c++/21323
* gdb.cp/wide_char_types.c: New file.
* gdb.cp/wide_char_types.exp: New file.
gdb/ChangeLog
gdb/c-lang.c
gdb/dwarf2read.c
gdb/gdbtypes.c
gdb/testsuite/ChangeLog
gdb/testsuite/gdb.cp/wide_char_types.c [new file with mode: 0644]
gdb/testsuite/gdb.cp/wide_char_types.exp [new file with mode: 0644]