The x11/mate-terminal port randomly fails to build. See: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=279034 The itstool python script randomly core dumps when processing the Japanese help files. The randomness comes from ASLR. If I disable ASLR: # sysctl kern.elf64.aslr.enable=0 kern.elf64.aslr.enable: 1 -> 0 then the core dumps always happen. Somewhat accidentally, I found that removing the libxml2 local patch, patch-python_libxml.c, the failures go away. Our commit for this patch: https://cgit.freebsd.org/ports/commit/textproc/libxml2/files/patch-python_libxml.c?id=25e6f68a6661303c0c4f23a304d3e0f713e89e11 refers to this upstream bug: https://bugzilla.gnome.org/show_bug.cgi?id=789714 which has a similar proposed fix. There is some discussion that the bug should be fixed elsewhere and be made more comprehensive. Discussion was moved to upstream bug https://gitlab.gnome.org/GNOME/libxml2/-/issues/64. The latter bug was closed with upstream commit: https://gitlab.gnome.org/GNOME/libxml2/-/commit/76c6da420923f2721a2e16adfcef8707a2454a1b which was included in upstream release 2.11.0: https://gitlab.gnome.org/GNOME/libxml2/-/commits/v2.11.0?ref_type=tags Since the bug is fixed upsream in the version we are building, we should not need the local patch. The local patch *should* just be redundant. I have no idea why it seems to be harmful.
This is the patched version of python/libxml.c: static void libxml_xmlErrorFuncHandler(ATTRIBUTE_UNUSED void *ctx, const char *msg, ...) { va_list ap; PyObject *list; PyObject *message; PyObject *result; char str[1000]; unsigned char *ptr = (unsigned char *)str; #ifdef DEBUG_ERROR printf("libxml_xmlErrorFuncHandler(%p, %s, ...) called\n", ctx, msg); #endif #if PY_MAJOR_VERSION >= 3 /* Ensure the error string doesn't start at UTF8 continuation. */ while (*ptr && (*ptr & 0xc0) == 0x80) ptr++; #endif The problem is that at this point, buf[] is still uninitialized and just contains stack garbage. If it doesn't contain any NUL bytes, the loop can walk ptr off the end of the array. Even if this doesn't happen, when ptr is used later, it will have a nonsense value.
Created attachment 260608 [details] patch to fix random py-libxml2 SEGFAULT
Created attachment 260611 [details] patch to fix random py-libxml2 SEGFAULT
(In reply to Don Lewis from comment #3) I can confirm that in one of my use cases random segfaults stopped in command: /usr/local/bin/python3.11 ./tools/xml2po.py <other options here> After rebuild of both textproc/py-libxml2 and textproc/libxml2 with patch attached (rm textproc/libxml2/files/patch-python_libxml.c). No need to rebuild textproc/libxml2?
(In reply to Don Lewis from comment #3) I think we can commit this as a "Fix port" without waiting for maintainer approval or the 2 week timeout.
(In reply to Vladimir Druzenko from comment #4) Correct. Only textproc/py-libxml2 is affected by the patch.
(In reply to Don Lewis from comment #6) Thanks. Do you want to commit your patch yourself?
(In reply to Vladimir Druzenko from comment #7) I'd prefer that desktop@ handle it, or wait for the timeout.
(In reply to Don Lewis from comment #8) Please coordinate with @diizzy, he works on libxml2 update. If he is not going to land his work soon, feel free to commit the fix.
(In reply to Max Brazhnikov from comment #9) We know about this PR: https://bugs.freebsd.org/279705 - added to "See Also". It has 14 blocking PRs - I don't think that update will be committed soon. By the way, his patch contains this patch too.
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/ports/commit/?id=d5b2d60fc76964a3e5b8f50d2992bd751f688f04 commit d5b2d60fc76964a3e5b8f50d2992bd751f688f04 Author: Don Lewis <truckman@FreeBSD.org> AuthorDate: 2025-05-23 21:48:34 +0000 Commit: Don Lewis <truckman@FreeBSD.org> CommitDate: 2025-05-23 22:17:11 +0000 textproc/py-pylibxml2: rm patch that scans garbage Remove a local patch that scans stack garbage for a pattern and then uses the pointer after the array has been filled with actual data. This can cause random segfaults. The patch is under textproc/libxml2, but the patched file is only used by textproc/py-pylibxml2. The UTF-8 issue this was intended to fix was fixed upstream in a more comprehensive way in 2.11.0 with this commit: https://gitlab.gnome.org/GNOME/libxml2/-/commit/76c6da420923f2721a2e16adfcef8707a2454a1b PR: 286439 279034 Approved by: desktop (makc) Tested by: vvd MFH: 2025Q2 .../libxml2/files/patch-python_libxml.c (gone) | 35 ---------------------- textproc/py-libxml2/Makefile | 2 +- 2 files changed, 1 insertion(+), 36 deletions(-)
A commit in branch 2025Q2 references this bug: URL: https://cgit.FreeBSD.org/ports/commit/?id=0b3da26d11cf4048be6143b8b8684d737d0160ab commit 0b3da26d11cf4048be6143b8b8684d737d0160ab Author: Don Lewis <truckman@FreeBSD.org> AuthorDate: 2025-05-23 21:48:34 +0000 Commit: Don Lewis <truckman@FreeBSD.org> CommitDate: 2025-05-23 22:23:25 +0000 textproc/py-pylibxml2: rm patch that scans garbage Remove a local patch that scans stack garbage for a pattern and then uses the pointer after the array has been filled with actual data. This can cause random segfaults. The patch is under textproc/libxml2, but the patched file is only used by textproc/py-pylibxml2. The UTF-8 issue this was intended to fix was fixed upstream in a more comprehensive way in 2.11.0 with this commit: https://gitlab.gnome.org/GNOME/libxml2/-/commit/76c6da420923f2721a2e16adfcef8707a2454a1b PR: 286439 279034 Approved by: desktop (makc) Tested by: vvd MFH: 2025Q2 (cherry picked from commit d5b2d60fc76964a3e5b8f50d2992bd751f688f04) .../libxml2/files/patch-python_libxml.c (gone) | 35 ---------------------- textproc/py-libxml2/Makefile | 2 +- 2 files changed, 1 insertion(+), 36 deletions(-)
Created attachment 260665 [details] proper fix for UTF-8 bug The reason the patch currently causes py-libxml to crash is that it was mismerged when libxml was upgraded to 2.11.6. The “ensure the error string doesn't start at UTF8 continuation” bit was supposed to happen after the `vsnprintf()` call. For unexplained reasons, it was moved to _before_ that call where, as others have pointed out, it operates on uninitialized data. However, this patch was never the correct fix. The bug occurs when `str` is too small for the message and the message is truncated in the middle of a UTF-8 sequence, so you need to trim the _end_ of `str`, but what the patch does is skip any partial UTF-8 sequences that occur at the _beginning_ of `str`, which I'm not sure is even possible unless the format string or some of the arguments passed to the handler are malformed. If you don't want to reintroduce the original bug, I suggest you use the attached patch instead of just dropping patch-python_libxml.c.
(In reply to Dag-Erling Smørgrav from comment #13) Yes, our version of the patch was mis-merged as compared to the Fedora patch for 2.9.x: https://src.fedoraproject.org/rpms/libxml2/blob/f32/f/libxml2-2.9.8-python3-unicode-errors.patch It appears that they have retained this patch through the present. At first glance, you patch appears to be correct, though nobody else seems to see the need. I can't complain about adding some defensive programming, but is should probably be a new ticket and/or sent upstream.
2.14 is not going to be committed anytime soon, with the consumer build failures (mostly stemming from still including deprecated/removed headers and symbols) being the biggest reason. Other reasons include continued use of cmake, which is not what upstream recommends for platforms like us (not to mention support is still not at par with autotools). My local copy of 2.14, which uses autotools, and particularly for the Python port (for which the correct port name is textproc/libxml2-python, derived from the Python package name), builds using Python's build process which is generated from autotools. I will investigate whether this patch is still needed, but I dropped this patch from my copy because it didn't seem right. (In reply to Max Brazhnikov from comment #9) This is a desktop@ port, not a diizzy@ port.
(In reply to Dag-Erling Smørgrav from comment #13) I suspect that the 1000 byte buffer is considered to be plenty long. The upstream commit for the original problem mentions that this path is for error messages. https://gitlab.gnome.org/GNOME/libxml2/-/commit/76c6da420923f2721a2e16adfcef8707a2454a1b
The patch isn't needed upstream because they already addressed the issue by not crashing if an incomplete UTF-8 sequence is encountered. It may still be needed here because we won't have that (more invasive) upstream fix until libxml2 is updated.
(In reply to Dag-Erling Smørgrav from comment #17) Upstream released their patch in 2.11.0 and our port is 2.11.9, so we've got the fix already. We aren't waiting on 2.14.