Fix Python3.9 related runtime problems
authorKevin Buettner <kevinb@redhat.com>
Thu, 28 May 2020 03:05:40 +0000 (20:05 -0700)
committerKevin Buettner <kevinb@redhat.com>
Thu, 28 May 2020 19:46:16 +0000 (12:46 -0700)
commitc47bae859a5af0d95224d90000df0e529f7c5aa0
tree4073a61c1ca5aa65bb7fdb9e40b46b28a941a0a9
parent66e3eb08a52ba20d3fb468cef04952aafdf534d4
Fix Python3.9 related runtime problems

Python3.9b1 is now available on Rawhide.  GDB w/ Python 3.9 support
can be built using the configure switch -with-python=/usr/bin/python3.9.

Attempting to run gdb/Python3.9 segfaults on startup:

    #0  0x00007ffff7b0582c in PyEval_ReleaseLock () from /lib64/libpython3.9.so.1.0
    #1  0x000000000069ccbf in do_start_initialization ()
at worktree-test1/gdb/python/python.c:1789
    #2  _initialize_python ()
at worktree-test1/gdb/python/python.c:1877
    #3  0x00000000007afb0a in initialize_all_files () at init.c:237
    ...

Consulting the the documentation...

https://docs.python.org/3/c-api/init.html

...we find that PyEval_ReleaseLock() has been deprecated since version
3.2.  It recommends using PyEval_SaveThread or PyEval_ReleaseThread()
instead.  In do_start_initialization, in gdb/python/python.c, we
can replace the calls to PyThreadState_Swap() and PyEval_ReleaseLock()
with a single call to PyEval_SaveThread.   (Thanks to Keith Seitz
for working this out.)

With that in place, GDB gets a little bit further.  It still dies
on startup, but the backtrace is different:

    #0  0x00007ffff7b04306 in PyOS_InterruptOccurred ()
       from /lib64/libpython3.9.so.1.0
    #1  0x0000000000576e86 in check_quit_flag ()
at worktree-test1/gdb/extension.c:776
    #2  0x0000000000576f8a in set_active_ext_lang (now_active=now_active@entry=0x983c00 <extension_language_python>)
at worktree-test1/gdb/extension.c:705
    #3  0x000000000069d399 in gdbpy_enter::gdbpy_enter (this=0x7fffffffd2d0,
gdbarch=0x0, language=0x0)
at worktree-test1/gdb/python/python.c:211
    #4  0x0000000000686e00 in python_new_inferior (inf=0xddeb10)
at worktree-test1/gdb/python/py-inferior.c:251
    #5  0x00000000005d9fb9 in std::function<void (inferior*)>::operator()(inferior*) const (__args#0=<optimized out>, this=0xccad20)
at /usr/include/c++/10/bits/std_function.h:617
    #6  gdb::observers::observable<inferior*>::notify (args#0=0xddeb10,
this=<optimized out>)
at worktree-test1/gdb/../gdbsupport/observable.h:106
    #7  add_inferior_silent (pid=0)
at worktree-test1/gdb/inferior.c:113
    #8  0x00000000005dbcb8 in initialize_inferiors ()
at worktree-test1/gdb/inferior.c:947
    ...

We checked with some Python Developers and were told that we should
acquire the GIL prior to calling any Python C API function.  We
definitely don't have the GIL for calls of PyOS_InterruptOccurred().

I moved class_gdbpy_gil earlier in the file and use it in
gdbpy_check_quit_flag() to acquire (and automatically release) the
GIL.

With those changes in place, I was able to run to a GDB prompt.  But,
when trying to quit, it segfaulted again due to due to some other
problems with gdbpy_check_quit_flag():

    Thread 1 "gdb" received signal SIGSEGV, Segmentation fault.
    0x00007ffff7bbab0c in new_threadstate () from /lib64/libpython3.9.so.1.0
    (top-gdb) bt 8
    #0  0x00007ffff7bbab0c in new_threadstate () from /lib64/libpython3.9.so.1.0
    #1  0x00007ffff7afa5ea in PyGILState_Ensure.cold ()
       from /lib64/libpython3.9.so.1.0
    #2  0x000000000069b58c in gdbpy_gil::gdbpy_gil (this=<synthetic pointer>)
at worktree-test1/gdb/python/python.c:278
    #3  gdbpy_check_quit_flag (extlang=<optimized out>)
at worktree-test1/gdb/python/python.c:278
    #4  0x0000000000576e96 in check_quit_flag ()
at worktree-test1/gdb/extension.c:776
    #5  0x000000000057700c in restore_active_ext_lang (previous=0xe9c050)
at worktree-test1/gdb/extension.c:729
    #6  0x000000000088913a in do_my_cleanups (
pmy_chain=0xc31870 <final_cleanup_chain>,
old_chain=0xae5720 <sentinel_cleanup>)
at worktree-test1/gdbsupport/cleanups.cc:131
    #7  do_final_cleanups ()
at worktree-test1/gdbsupport/cleanups.cc:143

In this case, we're trying to call a Python C API function after
Py_Finalize() has been called from finalize_python().  I made
finalize_python set gdb_python_initialized to false and then cause
check_quit_flag() to return early when it's false.

With these changes in place, GDB seems to be working again with
Python3.9b1.  I think it likely that there are other problems lurking.
I wouldn't be surprised to find that there are other calls into Python
where we don't first make sure that we have the GIL.  Further changes
may well be needed.

I see no regressions testing on Rawhide using a GDB built with the
default Python version (3.8.3) versus one built using Python 3.9b1.

I've also tested on Fedora 28, 29, 30, 31, and 32 (all x86_64) using
the default (though updated) system installed versions of Python on
those OSes.  This means that I've tested against Python versions
2.7.15, 2.7.17, 2.7.18, 3.7.7, 3.8.2, and 3.8.3.  In each case GDB
still builds without problem and shows no regressions after applying
this patch.

gdb/ChangeLog:

2020-MM-DD  Kevin Buettner  <kevinb@redhat.com>
    Keith Seitz  <keiths@redhat.com>

* python/python.c (do_start_initialization): For Python 3.9 and
later, call PyEval_SaveThread instead of PyEval_ReleaseLock.
(class gdbpy_gil): Move to earlier in file.
(finalize_python): Set gdb_python_initialized.
(gdbpy_check_quit_flag): Acquire GIL via gdbpy_gil.  Return early
when not initialized.
gdb/ChangeLog
gdb/python/python.c