Bug 241951 - databases/postgresql12-plpython: python3.6m crash database after ssh session close
Status: Closed Overcome By Events
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: pgsql
Depends on:
Reported: 2019-11-13 17:19 UTC by Anton Krutikov
Modified: 2020-02-15 16:45 UTC
bugzilla: maintainer-feedback? (pgsql)


Description Anton Krutikov 2019-11-13 17:19:48 UTC
PostgreSQL 12

In pkg and ports, postgresql links against libpython3.6m.so.1.0 by default, and I think this is buggy for plpython.

Problem scenario:

1. Connect via ssh, or use an ssh tunnel via pgadmin4
2. Execute any plpython3u function via psql or pgadmin4; it will work
3. Close the ssh session
4. Repeat steps 1-2; now the postgresql server will crash and auto-recover on every call to a function that uses plpython3u, until a manual "service postgresql restart"
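
The function used in step 2 is not shown in this report; the log further down only names a call to pyver(). As a minimal sketch, assuming pyver() simply reports which interpreter PL/Python loaded, its body could look like the Python below. The name python_version_report is hypothetical, and the CREATE FUNCTION wrapper in the comment is an assumption about how such a function would be registered.

```python
import sys

# Hypothetical body for a plpython3u function such as pyver().
# Assumed registration (not taken from the report):
#   CREATE FUNCTION pyver() RETURNS text AS $$
#       import sys
#       return sys.version
#   $$ LANGUAGE plpython3u;

def python_version_report():
    """Return the version string of the running interpreter."""
    return sys.version

print(python_version_report())
```

Calling such a function once per new connection is enough to exercise the interpreter start-up path that appears in the backtrace.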

Some digging:

After each crash, the log file contains:

LOG:  server process (PID 17414) was terminated by signal 6: Abort trap
DETAIL:  Failed process was running: select pyver();
LOG:  database system was not properly shut down; automatic recovery in progress
LOG:  redo starts at 0/1787B98
LOG:  invalid record length at 0/1787BD0: wanted 24, got 0
LOG:  redo done at 0/1787B98
LOG:  database system is ready to accept connections
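
To count how often the backend is killed, the log can be scanned for the "terminated by signal" line. The following is a rough sketch, assuming the log lines have exactly the format shown above; the function name find_backend_crashes and the sample list are illustrative only.

```python
import re

# Matches the postmaster line that reports a backend killed by a signal,
# in the format shown in the log excerpt above.
CRASH_RE = re.compile(
    r"server process \(PID (\d+)\) was terminated by signal (\d+): (.+)"
)

def find_backend_crashes(log_lines):
    """Return (pid, signal_number, signal_name) for each crash line."""
    crashes = []
    for line in log_lines:
        match = CRASH_RE.search(line)
        if match:
            pid, signo, name = match.groups()
            crashes.append((int(pid), int(signo), name.strip()))
    return crashes

# Sample lines copied from the log excerpt in this report.
sample = [
    "LOG:  server process (PID 17414) was terminated by signal 6: Abort trap",
    "DETAIL:  Failed process was running: select pyver();",
    "LOG:  database system is ready to accept connections",
]
print(find_backend_crashes(sample))
```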

Using gdb with postgres.core shows:

Program terminated with signal SIGABRT, Aborted.
#0  0x000000080142745a in thr_kill () from /lib/libc.so.7
(gdb) bt
#0  0x000000080142745a in thr_kill () from /lib/libc.so.7
#1  0x0000000801425844 in raise () from /lib/libc.so.7
#2  0x0000000801398079 in abort () from /lib/libc.so.7
#3  0x000000080b103535 in Py_FatalError () from /usr/local/lib/libpython3.6m.so.1.0
#4  0x000000080b1032eb in _Py_InitializeEx_Private () from /usr/local/lib/libpython3.6m.so.1.0
#5  0x000000080af0c920 in PLy_initialize () from /usr/local/lib/postgresql/plpython3.so
#6  0x000000080af0caaa in plpython3_call_handler () from /usr/local/lib/postgresql/plpython3.so
#7  0x000000000063cf90 in ?? ()
#8  0x000000000066bad8 in ?? ()
#9  0x0000000000643c2d in standard_ExecutorRun ()
#10 0x000000000079d68d in ?? ()
#11 0x000000000079d272 in PortalRun ()
#12 0x000000000079c247 in ?? ()
#13 0x000000000079a27d in PostgresMain ()
#14 0x000000000071f836 in ?? ()
#15 0x000000000071eef0 in ?? ()
#16 0x000000000071c099 in PostmasterMain ()
#17 0x00000000006906de in main ()

Maybe the problem is in the Python build with the "m" suffix, which means it was compiled with "--with-pymalloc", i.e. with Python's own memory allocator instead of the system malloc. I can't explain how the ssh session affects this.
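
Whether a given interpreter carries the "m" flag can be checked from Python itself. This is a minimal sketch, assuming the shared library follows the libpythonX.Y<flags>.so.1.0 naming seen in the backtrace; the helper shared_library_name is hypothetical and does not query the real file system.

```python
import sys
import sysconfig

def shared_library_name(major, minor, abiflags):
    """Build a libpython file name in the style seen in the backtrace.

    The ".so.1.0" suffix is an assumption copied from this report.
    """
    return f"libpython{major}.{minor}{abiflags}.so.1.0"

# ABIFLAGS may be None on platforms that do not define it.
abiflags = sysconfig.get_config_var("ABIFLAGS") or ""
print("ABI flags of this interpreter:", repr(abiflags))
print(
    "Expected library name:",
    shared_library_name(sys.version_info.major, sys.version_info.minor, abiflags),
)
```

On CPython 3.6 built with pymalloc the flag is "m"; CPython 3.8 dropped the "m" flag from its ABI tag, so a 3.8 library name has no such suffix.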

I can fix it only by using a different Python version at the make step:

make PYTHON_VERSION=python3.8 install
Comment 1 Tom Lane 2019-11-14 16:43:25 UTC
FWIW, I tried and failed to reproduce this per the directions --- although I was using a hand-built copy of Postgres git tip, not the FreeBSD package, so maybe it's somehow specific to the build options for the package?  I suspect though that the report is missing some crucial detail about how to reproduce.

I tested using a freshly-updated 12.0-RELEASE-p12/amd64 system.  python is

Name           : python36
Version        : 3.6.9
Installed on   : Mon Oct 28 15:52:59 2019 EDT
Origin         : lang/python36
Architecture   : FreeBSD:12:amd64
Comment 2 Anton Krutikov 2019-11-14 18:58:02 UTC
(In reply to Tom Lane from comment #1)
I am talking about the version that is in ports and pkg now.
And your python version is without the "m" suffix (the dependency in ports is r3.6m)
Comment 3 Tom Lane 2019-11-14 19:39:13 UTC
(In reply to Anton Krutikov from comment #2)
> I am talking about the version that is in ports and pkg now.

Well, if it can't be reproduced with upstream Postgres, it's not my problem ;-)

> And your python version is without the "m" suffix (the dependency in ports is r3.6m)

I do not see any separate package named that according to "pkg search python", and the standard package that I'm using does appear to be built with the PYMALLOC option.  My build is linking to a libpython that seems to be named correctly:

$ ldd testversion/lib/postgresql/plpython3.so
        libpython3.6m.so.1.0 => /usr/local/lib/libpython3.6m.so.1.0 (0x800e00000)
        libc.so.7 => /lib/libc.so.7 (0x800248000)
        libthr.so.3 => /lib/libthr.so.3 (0x800692000)
        libintl.so.8 => /usr/local/lib/libintl.so.8 (0x8006bd000)
        libdl.so.1 => /usr/lib/libdl.so.1 (0x8006ca000)
        libutil.so.9 => /lib/libutil.so.9 (0x8006ce000)
        libm.so.5 => /lib/libm.so.5 (0x8006e5000)

It's possible, if you're using something that's not the regular python36 package, that this boils down to being an ABI compatibility problem between what you are using and what the postgresql package was built against.
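
To compare the packaged plpython3.so against a local build, the ldd output can be checked mechanically for the libpython entry. The sketch below assumes the "name => path (address)" line format shown above; the function name linked_libpython is illustrative, and the sample text is copied from this comment.

```python
import re

# One ldd output line: "<name> => <path> (<load address>)"
LDD_RE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)\s+\((0x[0-9a-fA-F]+)\)")

def linked_libpython(ldd_output):
    """Return (name, path) pairs for every libpython entry in ldd output."""
    found = []
    for line in ldd_output.splitlines():
        match = LDD_RE.match(line)
        if match and match.group(1).startswith("libpython"):
            found.append((match.group(1), match.group(2)))
    return found

# Two lines copied from the ldd output in this comment.
sample = """\
        libpython3.6m.so.1.0 => /usr/local/lib/libpython3.6m.so.1.0 (0x800e00000)
        libc.so.7 => /lib/libc.so.7 (0x800248000)
"""
print(linked_libpython(sample))
```

If the name or path differs between the package build and the installed Python, that would point toward the ABI mismatch suggested above.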
Comment 4 Palle Girgensohn freebsd_committer 2020-02-15 12:06:28 UTC
@anton is this still a problem?
Comment 5 Anton Krutikov 2020-02-15 16:12:16 UTC
(In reply to Palle Girgensohn from comment #4)
After I rebuilt it from ports with appropriate versions of the dependencies, it works. And as far as I can see, there are now newer versions of the packages upstream. This behavior occurred on a default DigitalOcean droplet with FreeBSD and in a Hyper-V environment too.
Comment 6 Palle Girgensohn freebsd_committer 2020-02-15 16:40:51 UTC
Excellent that it works now! I'll close the PR then, OK?
Comment 7 Anton Krutikov 2020-02-15 16:45:07 UTC
(In reply to Palle Girgensohn from comment #6)
Yes, thank you!