Bug 198788 - mail/postfix: Does not start/build anymore since upgrade to security/openssl 1.0.2
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: amd64 Any
Importance: --- Affects Many People
Assignee: Dirk Meyer
Keywords: needs-qa, regression
Reported: 2015-03-22 10:25 UTC by pvoigt
Modified: 2015-06-14 18:38 UTC

Attachments
scripts to find which dynamic libraries are used by what (13.82 KB, text/x-uuencode)
2015-04-02 11:40 UTC, Martin Birgmeier
perform a dependency sorting of ports given on the command line (3.25 KB, application/x-awk)
2015-04-02 17:29 UTC, Martin Birgmeier

Description pvoigt 2015-03-22 10:25:12 UTC
I am currently using 10.1-RELEASE (amd64) and have just upgraded openssl to version 1.0.2. Now Postfix 2.11.4,1 does not start anymore:

# service postfix onestart
postconf: environment corrupt; missing value for readme_d�
/usr/local/sbin/postconf: fatal: out of memory
postconf: environment corrupt; missing value for queue_di�
/usr/local/sbin/postconf: fatal: out of memory
postconf: environment corrupt; missing value for readme_d�
/usr/local/sbin/postconf: fatal: out of memory
postconf: environment corrupt; missing value for readme_d�
/usr/local/sbin/postconf: fatal: out of memory
postconf: environment corrupt; missing value for readme_d�
/usr/local/sbin/postconf: fatal: out of memory
postfix/postfix-script: fatal: unable to create missing queue directories
postfix/postfix-script: fatal: Postfix integrity check failed!

The same error appears during build:
# portmaster --no-confirm --no-term-title -D -G postfix
...
[src/tlsproxy]
[src/posttls-finger]
/bin/sh postfix-install -non-interactive -package
postconf: environment corrupt; missing value for readme_d�
bin/postconf: fatal: out of memory
postconf: environment corrupt; missing value for readme_d�
bin/postconf: fatal: out of memory
postconf: environment corrupt; missing value for readme_d�
bin/postconf: fatal: out of memory
postconf: environment corrupt; missing value for readme_d�
bin/postconf: fatal: out of memory
postconf: environment corrupt; missing value for readme_d�
bin/postconf: fatal: out of memory
postfix-install: Error: "" should be an absolute path name.
*** Error code 1

Stop.
make[2]: stopped in /usr/ports/mail/postfix/work/postfix-2.11.4
*** Error code 1

Stop.
make[1]: stopped in /usr/ports/mail/postfix
*** Error code 1

Stop.
make: stopped in /usr/ports/mail/postfix

===>>> make stage failed for mail/postfix
===>>> Aborting update


===>>> You can restart from the point of failure with this command line:
  portmaster <flags> mail/postfix

Well, I could rebuild all other ports depending on openssl. I strongly suppose postfix needs a fix.

This issue is mentioned in the forum as well:
https://forums.freebsd.org/threads/postfix-does-not-start-build-anymore-since-upgrade-to-openssl-1-0-2.50959/

Regards,
Peter
Comment 1 Martin Birgmeier 2015-03-22 12:53:42 UTC
I have the same issue with several other programs/ports. For example, "service vboxnet start" hangs, inhibiting a successful boot of the machine.

# pkg which /usr/local/etc/rc.d/vboxnet 
/usr/local/etc/rc.d/vboxnet was installed by package virtualbox-ose-kmod-4.3.26
# 

In fact, for example all virtualbox-ose binaries hang. I had to revert to openssl-1.0.1_19 on all my machines in order to get things running again.

One reason may be that there is a library version conflict between the openssl as shipped with FreeBSD 10 and the one in ports. I have written a script to find the programs affected by such a conflict (i.e., ldd shows that the file loads multiple versions of a library):

conflicts for [libcrypto.so.7, libcrypto.so.8] (37)
        /usr/local/lib/soprano/libsoprano_raptorparser.so
        /usr/local/lib/soprano/libsoprano_raptorserializer.so
        /usr/local/lib/soprano/libsoprano_redlandbackend.so
        /usr/local/lib/libcurl.so.4.3.0
        /usr/local/lib/virtualbox/VBoxAutostart
        /usr/local/lib/virtualbox/VBoxBalloonCtrl
        /usr/local/lib/virtualbox/VBoxExtPackHelperApp
        /usr/local/lib/virtualbox/VBoxManage
        /usr/local/lib/virtualbox/VBoxSVC
        /usr/local/lib/virtualbox/VBoxTestOGL
        /usr/local/lib/virtualbox/VBoxXPCOMIPCD
        /usr/local/lib/virtualbox/vboxwebsrv
        /usr/local/lib/virtualbox/webtest
        /usr/local/lib/libraptor2.so.0.0.0
        /usr/local/lib/librasqal.so.3.0.0
        /usr/local/lib/librdf.so.0.0.0
        /usr/local/lib/libkolabxml.so.1.0.3
        /usr/local/lib/libkolab.so.0.5.3
        /usr/local/lib/libxmlrpc_client++.so.8.36
        /usr/local/lib/libxmlrpc_client.so.3.36
        /usr/local/bin/curl
        /usr/local/bin/rapper
        /usr/local/bin/roqet
        /usr/local/bin/rdfproc
        /usr/local/bin/redland-db-upgrade
        /usr/local/bin/akonadi_kolab_resource
        /usr/local/bin/akonadi_kolabproxy_resource
        /usr/local/bin/rtorrent
        /usr/local/bin/ogg123
        /usr/local/sbin/smbd
        /usr/local/libexec/git-core/git-http-fetch
        /usr/local/libexec/git-core/git-http-push
        /usr/local/libexec/git-core/git-imap-send
        /usr/local/libexec/git-core/git-remote-http
        /usr/local/libexec/git-core/git-remote-https
        /usr/local/libexec/git-core/git-remote-ftp
        /usr/local/libexec/git-core/git-remote-ftps
conflicts for [libssl.so.7, libssl.so.8] (3)
        /usr/local/lib/libxmlrpc_client++.so.8.36
        /usr/local/lib/libxmlrpc_client.so.3.36
        /usr/local/sbin/smbd

It is important to note that the conflicts depend on the order the ports are installed, because a newly installed port may pick up either the system's or the port collection's version of openssl, depending on whether the latter was already installed or not.

I believe that on FreeBSD 10 (and maybe others) it is necessary to completely get rid of the port version of openssl.

-- Martin

p.s. conflicts on another machine (also releng/10.1 amd64; has a somewhat different set of ports installed):

conflicts for [libcrypto.so.7, libcrypto.so.8] (41)
        /usr/local/lib/soprano/libsoprano_raptorparser.so
        /usr/local/lib/soprano/libsoprano_raptorserializer.so
        /usr/local/lib/soprano/libsoprano_redlandbackend.so
        /usr/local/lib/libcurl.so.4.3.0
        /usr/local/lib/libraptor2.so.0.0.0
        /usr/local/lib/librasqal.so.3.0.0
        /usr/local/lib/librdf.so.0.0.0
        /usr/local/lib/gtk-3.0/3.0.0/printbackends/libprintbackend-cups.so
        /usr/local/lib/libreoffice/program/libucpcmis1lo.so
        /usr/local/lib/libreoffice/program/libucpdav1.so
        /usr/local/lib/libreoffice/program/libucpftp1.so
        /usr/local/lib/libreoffice/program/libunordflo.so
        /usr/local/lib/libkolabxml.so.1.0.3
        /usr/local/lib/virtualbox/VBoxAutostart
        /usr/local/lib/virtualbox/VBoxBalloonCtrl
        /usr/local/lib/virtualbox/VBoxExtPackHelperApp
        /usr/local/lib/virtualbox/VBoxManage
        /usr/local/lib/virtualbox/VBoxSVC
        /usr/local/lib/virtualbox/VBoxTestOGL
        /usr/local/lib/virtualbox/VBoxXPCOMIPCD
        /usr/local/lib/virtualbox/vboxwebsrv
        /usr/local/lib/virtualbox/webtest
        /usr/local/lib/libkolab.so.0.5.3
        /usr/local/lib/libcmis-0.5.so.5.0.0
        /usr/local/lib/libcmis-c-0.5.so.5.0.0
        /usr/local/bin/curl
        /usr/local/bin/rapper
        /usr/local/bin/roqet
        /usr/local/bin/rdfproc
        /usr/local/bin/redland-db-upgrade
        /usr/local/bin/akonadi_kolab_resource
        /usr/local/bin/akonadi_kolabproxy_resource
        /usr/local/bin/ogg123
        /usr/local/bin/cmis-client
        /usr/local/libexec/git-core/git-http-fetch
        /usr/local/libexec/git-core/git-http-push
        /usr/local/libexec/git-core/git-imap-send
        /usr/local/libexec/git-core/git-remote-http
        /usr/local/libexec/git-core/git-remote-https
        /usr/local/libexec/git-core/git-remote-ftp
        /usr/local/libexec/git-core/git-remote-ftps
conflicts for [libssl.so.7, libssl.so.8] (41)
        /usr/local/lib/soprano/libsoprano_raptorparser.so
        /usr/local/lib/soprano/libsoprano_raptorserializer.so
        /usr/local/lib/soprano/libsoprano_redlandbackend.so
        /usr/local/lib/libcurl.so.4.3.0
        /usr/local/lib/libraptor2.so.0.0.0
        /usr/local/lib/librasqal.so.3.0.0
        /usr/local/lib/librdf.so.0.0.0
        /usr/local/lib/gtk-3.0/3.0.0/printbackends/libprintbackend-cups.so
        /usr/local/lib/libreoffice/program/libucpcmis1lo.so
        /usr/local/lib/libreoffice/program/libucpdav1.so
        /usr/local/lib/libreoffice/program/libucpftp1.so
        /usr/local/lib/libreoffice/program/libunordflo.so
        /usr/local/lib/libkolabxml.so.1.0.3
        /usr/local/lib/virtualbox/VBoxAutostart
        /usr/local/lib/virtualbox/VBoxBalloonCtrl
        /usr/local/lib/virtualbox/VBoxExtPackHelperApp
        /usr/local/lib/virtualbox/VBoxManage
        /usr/local/lib/virtualbox/VBoxSVC
        /usr/local/lib/virtualbox/VBoxTestOGL
        /usr/local/lib/virtualbox/VBoxXPCOMIPCD
        /usr/local/lib/virtualbox/vboxwebsrv
        /usr/local/lib/virtualbox/webtest
        /usr/local/lib/libkolab.so.0.5.3
        /usr/local/lib/libcmis-0.5.so.5.0.0
        /usr/local/lib/libcmis-c-0.5.so.5.0.0
        /usr/local/bin/curl
        /usr/local/bin/rapper
        /usr/local/bin/roqet
        /usr/local/bin/rdfproc
        /usr/local/bin/redland-db-upgrade
        /usr/local/bin/akonadi_kolab_resource
        /usr/local/bin/akonadi_kolabproxy_resource
        /usr/local/bin/ogg123
        /usr/local/bin/cmis-client
        /usr/local/libexec/git-core/git-http-fetch
        /usr/local/libexec/git-core/git-http-push
        /usr/local/libexec/git-core/git-imap-send
        /usr/local/libexec/git-core/git-remote-http
        /usr/local/libexec/git-core/git-remote-https
        /usr/local/libexec/git-core/git-remote-ftp
        /usr/local/libexec/git-core/git-remote-ftps

and on releng/10.1 i386 (again with a different set of ports installed):

conflicts for [libcrypto.so.7, libcrypto.so.8] (31)
        /usr/local/lib/gtk-3.0/3.0.0/printbackends/libprintbackend-cups.so
        /usr/local/lib/libcupsfilters.so.1.0.0
        /usr/local/sbin/cups-browsed
        /usr/local/sbin/squid
        /usr/local/sbin/squidclient
        /usr/local/libexec/cups/filter/bannertopdf
        /usr/local/libexec/cups/filter/commandtoescpx
        /usr/local/libexec/cups/filter/commandtopclx
        /usr/local/libexec/cups/filter/gstoraster
        /usr/local/libexec/cups/filter/imagetopdf
        /usr/local/libexec/cups/filter/imagetoraster
        /usr/local/libexec/cups/filter/pdftoijs
        /usr/local/libexec/cups/filter/pdftoippprinter
        /usr/local/libexec/cups/filter/pdftoopvp
        /usr/local/libexec/cups/filter/pdftopdf
        /usr/local/libexec/cups/filter/pdftops
        /usr/local/libexec/cups/filter/pdftoraster
        /usr/local/libexec/cups/filter/rastertoescpx
        /usr/local/libexec/cups/filter/rastertopclx
        /usr/local/libexec/cups/filter/rastertopdf
        /usr/local/libexec/cups/filter/texttopdf
        /usr/local/libexec/cups/backend/parallel
        /usr/local/libexec/cups/backend/serial
        /usr/local/libexec/git-core/git-http-fetch
        /usr/local/libexec/git-core/git-http-push
        /usr/local/libexec/git-core/git-imap-send
        /usr/local/libexec/git-core/git-remote-http
        /usr/local/libexec/git-core/git-remote-https
        /usr/local/libexec/git-core/git-remote-ftp
        /usr/local/libexec/git-core/git-remote-ftps
        /usr/local/libexec/squid/cachemgr.cgi
conflicts for [libssl.so.7, libssl.so.8] (22)
        /usr/local/lib/gtk-3.0/3.0.0/printbackends/libprintbackend-cups.so
        /usr/local/lib/libcupsfilters.so.1.0.0
        /usr/local/sbin/cups-browsed
        /usr/local/libexec/cups/filter/bannertopdf
        /usr/local/libexec/cups/filter/commandtoescpx
        /usr/local/libexec/cups/filter/commandtopclx
        /usr/local/libexec/cups/filter/gstoraster
        /usr/local/libexec/cups/filter/imagetopdf
        /usr/local/libexec/cups/filter/imagetoraster
        /usr/local/libexec/cups/filter/pdftoijs
        /usr/local/libexec/cups/filter/pdftoippprinter
        /usr/local/libexec/cups/filter/pdftoopvp
        /usr/local/libexec/cups/filter/pdftopdf
        /usr/local/libexec/cups/filter/pdftops
        /usr/local/libexec/cups/filter/pdftoraster
        /usr/local/libexec/cups/filter/rastertoescpx
        /usr/local/libexec/cups/filter/rastertopclx
        /usr/local/libexec/cups/filter/rastertopdf
        /usr/local/libexec/cups/filter/texttopdf
        /usr/local/libexec/cups/backend/parallel
        /usr/local/libexec/cups/backend/serial
        /usr/local/libexec/git-core/git-imap-send
Comment 2 pvoigt 2015-03-22 15:01:21 UTC
Things got worse for me as well. Several programs and services started to core dump. The system was left completely unstable. Luckily, I had a backup from yesterday afternoon and I immediately restored it. I just lost a couple of emails, but my system is stable again.

I have described some of these problems in the forum (see link above).

So I suspect openssl is the problem and needs a fix.

Regards,
Peter
Comment 3 Martin Birgmeier 2015-03-22 15:35:03 UTC
Yes, I also had coredumps of several programs. Curl (on releng/10.1 i386, but not amd64) for example.

Others just hang, as I have described before.

-- Martin
Comment 4 Bryan Drewery freebsd_committer 2015-03-23 21:54:49 UTC
Are you using ldap in /etc/nsswitch.conf?
Comment 5 Bryan Drewery freebsd_committer 2015-03-23 21:55:08 UTC
Or Ldap with postfix?
Comment 6 Matthew Rezny freebsd_committer 2015-03-23 22:36:37 UTC
Same problem with VirtualBox on the two machines I put 1.0.2a on, resolved by reverting to 1.0.1m.

vboxnet service hangs when the machine starts but eventually exits after 10+ min wait, or immediately with ctrl-c in case the machine is local.

Vbox binaries don't hang forever, but long enough to be unusable. Trying to start VirtualBox GUI eventually results in an error popup. Starting a second instance gets a GUI quicker, but then trying to start any VM fails with an error suggesting the COM server is dead.

VBoxManage hangs on 'list vms' but can show hostinfo, so it depends on some specific function getting stuck; there are probably other safe parameters. On startup, loading the two kernel modules succeeds; it is the call to VBoxManage to list the local interfaces that hangs. That's just status, so it could be commented out, but it is the first clear sign of trouble.

All fixed by reverting OpenSSL.
Comment 7 pvoigt 2015-03-23 23:17:01 UTC
(In reply to Bryan Drewery from comment #5)

Yes, I am using postfix authenticating against openldap.

And yes, I am also using ldap in /etc/nsswitch.conf.

I have pointed out the assumed relationship of this issue with openldap support in the forum https://forums.freebsd.org/threads/postfix-does-not-start-build-anymore-since-upgrade-to-openssl-1-0-2.50959/ and it is discussed in the list as well http://lists.freebsd.org/pipermail/freebsd-ports/2015-March/098500.html (archive seems incomplete).

Regards,
Peter
Comment 8 Dirk Meyer freebsd_committer 2015-03-24 06:59:00 UTC
You can not mix base openssl and port.

Please rebuild all ports using openssl,
so that they use the new API and link to the port only.
Comment 9 pvoigt 2015-03-24 09:01:57 UTC
(In reply to Dirk Meyer from comment #8)

I don't know if you are referring to me. I assume yes.

I did no manual interaction to force any of my ports building against base openssl. I am not using any switch in /etc/make.conf.

And if some of my ports should have been built against base openssl: There was definitely no problem with it right before upgrading to openssl 1.0.2.

Nevertheless, I am obviously not the only one who does not know how to safely check whether all ports are sanely built against port openssl. Could you please give a brief summary of how to do such a check? The only (annoying) way I currently know is to list the reverse dependencies of openssl and to check each binary manually with ldd.
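
A minimal sketch of that manual check, automated. The helper names are hypothetical, the pkg flags are used as documented in pkg-info(8) (-r required-by, -l file list, -q quiet), and the sketch assumes that base OpenSSL libraries live under /lib or /usr/lib while the port installs under /usr/local/lib.

```sh
# Succeeds if the ldd output on stdin shows libssl or libcrypto
# resolved from the base system (/lib or /usr/lib).
links_base_openssl() {
    grep -Eq '(libssl|libcrypto)\.so\.[0-9]+ => /(usr/)?lib/'
}

# Hypothetical driver, FreeBSD only: walk every package that requires
# the openssl port and print the files that still pull in base OpenSSL.
# File names containing whitespace are not handled in this sketch.
scan_openssl_dependents() {
    for p in $(pkg info -rq openssl); do
        for f in $(pkg info -lq "$p"); do
            ldd "$f" 2>/dev/null | links_base_openssl && echo "$p: $f"
        done
    done
}
```

Files reported by scan_openssl_dependents would be the candidates to inspect by hand with ldd.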

Just to make things clear: I had to restore a root filesystem dump because of the system instability and I am back on openssl-1.0.1_19.

Regards,
Peter
Comment 10 Matt Smith 2015-03-24 12:40:00 UTC
I have similar issues affecting other software, which have caused me to downgrade back to 1.0.1. I recompiled all ports which link against openssl and it made no difference, and as far as I can tell from ldd they are only linked against the port version of openssl and not the base. I have nothing special in make.conf to force this. What I have found is that PHP-FPM races and spawns hundreds of processes, with the load average going up to 50+ and increasing until it's killed. And some shell scripts stop working. If I run a shell script with /bin/sh as the interpreter it says that the environment is corrupt and USERNAME doesn't exist. Other people on the mailing lists have complained that things like /bin/vi also say environment corrupt. All this is weird as none of vi, sh, or PHP-FPM link to openssl. Is some memory maybe being overwritten? If you downgrade to 1.0.1 then all the problems go away. I'm on 10.1-STABLE r280277 amd64.
Comment 11 Matthew Rezny freebsd_committer 2015-03-24 14:21:29 UTC
Each time I changed the version of security/openssl I ran portmaster -r openssl to upgrade everything using it. I have WITH_OPENSSL_PORTS=yes so all ports should be using the port openssl and not base openssl. Still, all of base is built using base openssl, so mixing ports and base is necessarily mixing port openssl and base openssl. Having those be incompatible versions wreaks havoc.

P.S. No LDAP in my nsswitch.
Comment 12 Matt Smith 2015-03-24 15:35:18 UTC
I should also mention, no LDAP configuration at all on my server. My nsswitch.conf is stock as it came when it was a fresh installation.
Comment 13 Matt Smith 2015-03-24 21:31:06 UTC
Someone on the forums suggested disabling the ASM optimized assembler code option in the settings. This seems to have solved the problem for me. Everything is working now. I have an Intel Atom D525. I guess there is some dodgy interaction between that and openssl 1.0.2 with that option enabled. That option worked fine on 1.0.1.
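
A hedged sketch of one way to switch the option off without the options dialog. It assumes the ports tree in use honors the per-port knob security_openssl_UNSET in make.conf; the function name is hypothetical, and running "make config" in security/openssl and unticking ASM is the interactive alternative. The port still has to be rebuilt and reinstalled afterwards.

```sh
# Append the per-port "unset ASM" knob to make.conf exactly once.
# $1 is the file to edit and defaults to /etc/make.conf.
# disable_openssl_asm is a hypothetical helper name.
disable_openssl_asm() {
    conf=${1:-/etc/make.conf}
    line='security_openssl_UNSET+=ASM'
    # Only add the line if it is not already present verbatim.
    grep -qxF "$line" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
}
```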
Comment 14 Bryan Drewery freebsd_committer 2015-03-24 23:35:35 UTC
Can someone please post a build log of security/openssl with ASM enabled?
Comment 15 Matt Smith 2015-03-25 08:11:06 UTC
1) https://www.dropbox.com/s/hmett869on9pq0k/openssl2.txt?dl=0

2) https://www.dropbox.com/s/ydcj0zzuihnt2lt/openssl3.txt?dl=0

1 is with ASM enabled, 2 is with ASM disabled.

When 1 is installed weird things like this happen:

# ./start.sh
sh: environment corrupt; missing value for USERNAME

And things like the original poster posted about postfix.

When I install 2 everything works perfectly fine. I was thinking about trying to compile it using gcc rather than clang? Might try that in a bit.
Comment 16 Matt Smith 2015-03-25 11:44:48 UTC
I just tried recompiling it using gcc48 with ASM enabled and the same fault happens. So it's not a clang specific issue. It is just related to ASM.

FYI, someone else on the forum thread just posted this as well, so another application that doesn't like it. They said with ASM disabled it works fine.

root@daemon # minidlnad -R -u dlna -f /usr/local/etc/minidlna.conf
sh: environment corrupt; missing value for BLOCKSIZ?
Comment 17 Mariusz J. Handke 2015-03-25 15:38:47 UTC
Currently on:

FreeBSD alice 10.0-RELEASE-p10 FreeBSD 10.0-RELEASE-p10 #0: Mon Oct 20 12:42:25 UTC 2014     root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64


Had the same issue, though with other ports, i.e. Dovecot, had to downgrade OpenSSL


Mar 24 23:38:36 imap-login: Error: imap-login: environment corrupt; missing value for LOG_TO_M
Mar 24 23:38:36 imap-login: Fatal: putenv(RESTRICT_USER=) failed: Bad address

[..]

Mar 25 11:32:13 auth(default): Error: dovecot-auth: environment corrupt; missing value for LOG_TO_M
Mar 25 11:32:13 auth(default): Error: dovecot-auth: Fatal: putenv(RESTRICT_USER=) failed: Bad address
Mar 25 11:32:13 dovecot: Error: child 95110 (auth) returned error 89 (Fatal failure)
Mar 25 11:32:13 dovecot: Fatal: Auth process died too early - shutting down

[..]

Mar 25 12:04:28 deliver(***user***): Error: userdb lookup: connect(/var/run/dovecot/auth-master) failed: Connection refused
Comment 18 Mariusz J. Handke 2015-03-25 15:42:21 UTC
(In reply to Mariusz J. Handke from comment #17)

I forgot to mention that with OpenSSL 1.0.2a, disabling ASM did not make any difference in my case with Dovecot.
Comment 19 Dirk Meyer freebsd_committer 2015-03-26 06:11:35 UTC
Summary:

We have two different issues:

Issue A: amd64, programs that link in openssl port and openssl base.

Fix 1: rebuild all components with openssl port

or:

Fix 2: use only openssl base.


Issue B: i386, ASM code seems to have problems on some CPUs.

Fix 1:  disable option ASM


I can not reproduce the Issue B on FreeBSD 8.4 i386.
Comment 20 pvoigt 2015-03-26 10:34:39 UTC
(In reply to Dirk Meyer from comment #19)

Hm, I feel a bit lost with your proposed "solution". I am running amd64.

And I am finally left with more open than solved questions. Let me give a short summary:

1.) All my ports are working rock stable with openssl 1.0.1_19.

2.) I have never explicitly used USE_OPENSSL=yes or WITH_OPENSSL_PORT=yes
    in /etc/make.conf.

3.) Pure upgrade to openssl 1.0.2 just worked without issues.

4.) Severe problems started on my machine when rebuilding all ports depending
    on openssl. These problems even prevented me from re-compiling all those
    ports due to numerous core dumps and errors. Even sh and vi core dumped,
    making my system more or less unresponsive. These observations are
    described by a lot of people. I had to restore my root file system.

At least I need a reliable way to find out:

1.) which ports depend on openssl port. Is "pkg info -r openssl" the right way
    or do I have to investigate each binary with ldd?

2.) which ports accidentally depend on base openssl.

3.) how to prevent individual ports from building against base openssl again,
    now and in the future.

The answers to these questions should be added to UPDATING.

If the answers can not be given or if they cannot ensure a stable system:
Shouldn't FreeBSD better stay with openssl 1.0.1 for some time?

Regards,
Peter
Comment 21 Matt Smith 2015-03-26 10:47:24 UTC
I personally did a portmaster -r openssl to rebuild all ports that depended on openssl followed by restarting everything in /usr/local/etc/rc.d

What I am concerned about is the potential performance impact of disabling ASM. Does anyone know the relative penalty for having this switched off? I'd like to work towards a solution that allows me to re-enable this option, as I guess it's there for a reason!
Comment 22 pvoigt 2015-03-26 11:43:10 UTC
(In reply to Matt Smith from comment #21)

Just to make sure: Are you on i386 or on amd64?

If I read Dirk Meyer's post correctly, disabling option ASM is a solution for i386 only.
Comment 23 Matt Smith 2015-03-26 11:50:21 UTC
(In reply to pvoigt from comment #22)

I'm on amd64, 10.1-STABLE r280277. So that isn't the case.
Comment 24 pvoigt 2015-03-26 12:03:20 UTC
(In reply to Matt Smith from comment #23)

Thanks for quick reply. So disabling ASM seems a fix for amd64, too.

To be honest I tend to rebuild all ports against base openssl and finally delete port openssl. But I don't know, if all currently installed ports do carefully honor the base openssl switch "USE_OPENSSL=yes" in /etc/make.conf.

Unfortunately, I am quite busy right now, having not enough time to regularly observe my server for possible problems. So I have to delay my possible tests/fixes for some days.
Comment 25 pvoigt 2015-03-26 14:41:20 UTC
Updated URL of the related issue as discussed in freebsd-stable:
http://lists.freebsd.org/pipermail/freebsd-stable/2015-March/082044.html
Comment 26 Dirk Meyer freebsd_committer 2015-03-29 18:45:18 UTC
(In reply to pvoigt from comment #24)

... "USE_OPENSSL=yes" in /etc/make.conf.

is wrong .... Please do not do this.


You can set in /etc/make.conf:

WITH_OPENSSL_PORT=yes

or

WITH_OPENSSL_BASE=yes
Comment 27 pvoigt 2015-03-29 19:08:37 UTC
(In reply to Dirk Meyer from comment #26)

Thanks, Dirk. I have realized my typo right after posting. However, there are two questions left, that I cannot get clarified with searching the FreeBSD forum:

1.) Is WITH_OPENSSL_PORT=yes the default setting, if nothing is specified in
    /etc/make.conf? Currently I have nothing specified in my /etc/make.conf.

2.) I have tried to build an individual port with WITH_OPENSSL_BASE=yes.
    In my test I have used security/stunnel. However, it refuses to use base
    openssl and auto-selected port openssl.

    Does it mean that any port refuses to build against base openssl as long as it
    is installed? This would be annoying.

    Should every port build against base openssl as long as WITH_OPENSSL_BASE=yes
    is used?

Regards,
Peter
Comment 28 Dirk Meyer freebsd_committer 2015-04-02 05:50:32 UTC
(In reply to pvoigt from comment #27)

1.) Is WITH_OPENSSL_PORT=yes the default setting, if nothing is specified in
    /etc/make.conf? 

The default is to use OpenSSL from base.


2.) I have tried to build an individual port with WITH_OPENSSL_BASE=yes.
    In my test I have used security/stunnel. However, it refuses to use base
    openssl and auto-selected port openssl.

    Does it mean that any port refuses to build against base openssl as long as it
    is installed? This would be annoying.

Yes, the linker will always prefer the higher shared lib version present at build time and link in the OpenSSL lib from ports.

The compiler might also pick up the API from ports that is present in the include files, which can cause runtime problems such as crashes and memory corruption because the sizes of structures do not match.

Summary: You can not build anything that uses OpenSSL from base
while the OpenSSL port is installed.

But ports cleanly built with OpenSSL from base can coexist with newly built ports that link with OpenSSL from ports.
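
To see whether two different header sets are present at build time, the version strings in the two opensslv.h files can be compared. This is only a sketch: the function name is hypothetical, and it assumes the header defines OPENSSL_VERSION_TEXT on a single line (where a FIPS and a non-FIPS variant are both defined, the last one is taken).

```sh
# Print the OpenSSL version string defined in the given opensslv.h.
# header_openssl_version is a hypothetical helper name.
header_openssl_version() {
    sed -n 's/^#[[:space:]]*define[[:space:]]\{1,\}OPENSSL_VERSION_TEXT[[:space:]]\{1,\}"\(.*\)"$/\1/p' "$1" | tail -n 1
}
```

For example, the output of header_openssl_version /usr/include/openssl/opensslv.h (base) can be compared with that of header_openssl_version /usr/local/include/openssl/opensslv.h (port).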


3.) Should every port build against base openssl as long as WITH_OPENSSL_BASE=yes
    is used?

Yes, unless the port itself forces the OpenSSL port.


Unsolved Problems:

Ports that link against the OpenSSL port and also link against base libs which themselves use OpenSSL from base.

e.g. libfetch, pam_ldap ...

These ports must be built on a clean system or jail and link to OpenSSL from base.
Comment 29 Martin Birgmeier 2015-04-02 08:13:55 UTC
My experience has been the following:

1) Initially, ports build using openssl from base. This results in a set of ports "A" using openssl from base.

2) There is at least one port which explicitly overrides this (if none of the WITH_OPENSSL_{BASE,PORT} flags is set) and pulls in openssl from ports. In my case, looking at the Makefiles of the ports I have installed, at least dns/bind910, net/freeradius3, and net/libsrtp are potential culprits.

2a) Once openssl from ports exists, other ports use this. This results in a set of ports "B" using openssl from ports.

3) There are ports which need libraries from both sets "A" and "B", and voila, a library conflict ensues for those.

I have found out that explicitly setting WITH_OPENSSL_BASE=yes before building any port suffices to keep the ports' version of openssl out (probably because this setting overrides the one in the port's Makefile). What I did is write a script which outputs conflicting libs for all executables/shared libraries (see comment #1) and recompile (in dependency order) all packages containing said exes/libs.

I think it is necessary to clean up ports to not specify WITH_OPENSSL_PORTS, and also bsd.openssl.mk to not even offer the choice on systems which include openssl in base (visibly, i.e., not in lib/private).

-- Martin

p.s. Remember "DLL hell"?
Comment 30 pvoigt 2015-04-02 10:59:26 UTC
Thanks, Dirk and Martin. Your comments shed some light on the "openssl library hell".

I am still on port openssl 1.0.1_19 due to lack of time. I did not even find time to install port openssl 1.0.2 with ASM=off. And tomorrow I will be on holiday for one week.

There obviously is an issue with ASM=on on some hardware. On the other hand I am not quite sure if base openssl and port openssl (1.0.2) can sanely coexist at this time at all.

According to Martin it should be possible to build all ports against base openssl, if you previously remove port openssl. More exactly: before the first port is built against openssl. Even if you create a list of ports depending on openssl with
# pkg info -r openssl

and subsequently delete the currently installed port openssl:
# pkg delete -f <current_port_openssl>

and adjust WITH_OPENSSL_BASE=yes in /etc/make.conf

any subsequent rebuild of a port will grab port openssl again, because dependencies still point to port openssl. Therefore I see no chance to rebuild all ports on a running system against base openssl.

Could anybody please correct or confirm this?

Martin, if I remember correctly, libraries of openssl are not compatible if they have different version numbers. If any program should accidentally grab port openssl 1.0.2, it is assumed to fail. Therefore I have been extremely surprised that port openssl was suddenly upgraded from 1.0.1 to 1.0.2 without even any comment in UPDATING. Before port openssl version 1.0.2 this was no issue, because port and base openssl have been version compatible.

Martin, did I understand you correctly that there are ports linking both against base and port openssl? This looks strange to me. I am not an expert with such things but have until now the following simple picture: On program start openssl libs are searched in the order base and port openssl and first match wins. But linking should be done against exactly one library version.

Martin, could you please provide your script for investigating the openssl dependencies? I would like to run it on my server.

Regards,
Peter
Comment 31 Martin Birgmeier 2015-04-02 11:32:55 UTC
What to do:
-----------

1) Set WITH_OPENSSL_BASE=yes in /etc/make.conf

2) Run the scripts as follows (you might want to adjust the exclusion list in ldd.scan; also, make sure that gawk is installed):

    ( ldd.scan > /tmp/x4 2> /tmp/x5 && ldd.genstats < /tmp/x4 > /tmp/x6 ) &

3) You can then look at /tmp/x6 using "less", and search for the following pattern:

    "conflicts for " (without the quotes)

4) Using your favorite editor/tools, cut the executables/libraries which are shown below the conflicting libraries you are interested in, and store these lines into a file called "/tmp/x7".

5) Run

    pkg which -q `cat /tmp/x7` | sort -u > /tmp/x8

This will give you the list of ports which need to be recompiled.

6) Forcibly remove the port openssl:

    pkg delete -f openssl-...

Note: After this, programs using the port openssl won't be able to run any more. It is best to do this from a console command prompt. On the other hand, programs already running will continue to do so, so a server will be able to continue serving if it does not need to fork.

7) Somehow determine the dependency order of the ports in /tmp/x8, and rebuild all of them according to that order.
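
Step 4 above can also be done without an editor. The following is only a sketch: conflict_files is a hypothetical helper, and it assumes the report has the layout shown in comment #1, i.e. an unindented "conflicts for [...]" header followed by indented absolute paths.

```sh
# Print the files listed under every "conflicts for [...]" header
# whose text contains the string given as $1 (e.g. libcrypto).
# conflict_files is a hypothetical helper name.
conflict_files() {
    awk -v lib="$1" '
        /^conflicts for /    { wanted = (index($0, lib) > 0); next }
        /^[^ \t]/            { wanted = 0; next }   # any other header ends the section
        wanted && $1 ~ /^\// { print $1 }
    '
}
```

With it, steps 4) and 5) could be approximated as: conflict_files libcrypto < /tmp/x6 | sort -u > /tmp/x7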

Answers to your questions:
--------------------------

Rebuild on a running system: Yes, but see the caveats in 6) above.

"... libraries of openssl are not compatible...": I have no idea whether they are "compatible" or not. But strictly speaking, mixing installations in such a way can lead to all kinds of errors, depending on how symbols are resolved by the runtime linker, which dynamic data structures are created/accessed/deleted by each of the versions, etc., even if exactly the same version were to be installed twice.

"... there are ports linking both against base and port openssl...": Yes of course. See 3) above. For each of the exes found, you may check again using "ldd".
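
A minimal sketch of the "check again using ldd" step, not one of the attached scripts: the helper below reads ldd output on stdin and reports whether libcrypto resolves to more than one path (for example /lib/libcrypto.so.7 and /usr/local/lib/libcrypto.so.8). The function name has_mixed_libcrypto is made up for illustration, and the sketch assumes the usual "name => path (address)" ldd line format.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the ldd(1) output on stdin
# resolves libcrypto to more than one distinct path, fails otherwise.
has_mixed_libcrypto() {
    awk '
        # Keep only resolved libcrypto lines and remember each distinct path.
        $2 == "=>" && $1 ~ /^libcrypto\.so/ { seen[$3] = 1 }
        END {
            n = 0
            for (p in seen) n++
            exit (n > 1 ? 0 : 1)
        }
    '
}
```

It might be used as

    ldd /usr/local/sbin/postconf | has_mixed_libcrypto && echo "mixed libcrypto"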

Scripts follow.

-- Martin
Comment 32 Martin Birgmeier 2015-04-02 11:40:36 UTC
Created attachment 155121 [details]
scripts to find which dynamic libraries are used by what

These are two scripts I use to generate the following relationships:

dynlib -> list of exes/dynlibs using it

[various versions of a dynlib] -> list of exes/dynlibs using them (= library conflict)

They are somewhat stupid but should not really it your machine; specifically, dynlibs only known locally to a program will be shown with an "XXX" prepended.

Searching the output of ldd.genstats using the pattern

    "compat\/pkg|XXX|conflicts for "

usually leads to interesting insights (for me :-)).

gawk needs to be installed.

The exclusions in ldd.scan should be edited to taste.

-- Martin
Comment 33 Martin Birgmeier 2015-04-02 11:41:41 UTC
"it" -> "eat"
Comment 34 pvoigt 2015-04-02 16:44:03 UTC
Martin, thanks for your explanations and your script. In particular, thank you for clarifying #comment 31, item 6: I made an error, because after a forced deletion of port openssl the index file is updated, i.e. it just shows a broken dependency and not, as I originally assumed, the previous dependency on port openssl. This gives a chance to rebuild against base openssl. Due to lack of time I will have to delay further investigations by a couple of days.

I would like to share/discuss the following ideas and questions:

1.) Did you rebuild in the meantime all of your ports against base openssl?

2.) Referring to your #comment 31, item 7:

    Given I have a list of ports to be rebuilt against base openssl in a file
    called "filename": Shouldn't it be enough to feed them e.g. to portmaster
    in a way like:

    for i in `cat $filename`
    do
      portmaster $i
    done

    i.e. portmaster should figure out the right order in which the ports
    have to be rebuilt.

3.) Referring to your #comment 31: If I intend to rebuild all ports depending
    on openssl against base openssl anyway, isn't it enough to find them
    with

    pkg info -r openssl

    instead of using your script?

Regards,
Peter
Comment 35 Martin Birgmeier 2015-04-02 17:15:29 UTC
1) Yes.

2) Your command will simply recompile the ports in the order given in $filename, which may or may not be in dependency order. What you might have meant is to issue

    portmaster `cat $filename`

but unfortunately even this does not do any dependency checking if ports are already up-to-date (i.e., are going to only be recompiled). To see that this is so, you can try

    portmaster 'f*'

which will happily recompile all ports starting with "f" in alphabetical order... This is actually a gripe I have with portmaster; it should always do a dependency sorting of the ports it is going to work on.

3) My experience has been "no", because it seems that ports link to the ports' openssl without this being explicitly noted in the Makefile as a dependency, and then pkg does not know about it. Conversely, using the recorded dependency information leads to a lot more ports seemingly depending on the ports' openssl when in reality no executable/library is really affected. The drawback of my method is that 'pkg check -da' complains about unresolved dependencies on openssl for the ports which have not been recompiled - but this can safely be ignored.

You might try your method first, then check the result with my scripts.
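
For the ordering problem in 2), here is a minimal sketch using tsort(1), the standard topological sort utility. It is neither what portmaster does nor the attached get_port_order script. The function name rebuild_order is made up; it assumes you already have "dependency dependent" pairs from some source and a file listing the ports you want to rebuild, one per line.

```shell
#!/bin/sh
# Hypothetical helper: read "dependency dependent" pairs on stdin and print
# the ports listed in file $1, one per line, dependencies first.
rebuild_order() {
    wanted=$1
    {
        # The dependency pairs supplied by the caller.
        cat
        # A self-pair tells tsort that a port exists without adding an
        # ordering constraint, so ports without any pair are not lost.
        awk '{ print $1, $1 }' "$wanted"
    } | tsort | grep -Fx -f "$wanted"
}
```

One possible source for the pairs, untested here and subject to the caveat in 3) that recorded dependencies may not match what is actually linked, is pkg's own database. With ports.txt as a hypothetical file of port origins:

    pkg query -a '%do %o' | rebuild_order ports.txt

The list file and the pairs must use the same kind of identifier (port origins in this example).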

-- Martin
Comment 36 Martin Birgmeier 2015-04-02 17:29:43 UTC
Created attachment 155125 [details]
perform a dependency sorting of ports given on the command line

This script performs a dependency sorting of the ports given on the command line. It assumes that /usr/ports holds the complete ports tree.

As an example, it may be invoked as

    get_port_order `pkg query -a %o`

to get the dependency relationships for all currently installed ports (after sorting the first part of its output numerically on the second field, tab-delimited).

The advantage over "pkg" is that it reflects the current situation and not the whole history of dependencies as recorded by "pkg".

Notes:
- Requires gawk.
- Fishes out values of internal variables of the ports Mk infrastructure, so is prone to break in the future (but in fact hasn't for quite some time now).
- Contains at least one local hack which should be harmless on other installations.
- Should not really eat your computer. ;-)

-- Martin

p.s. portmaster should actually be doing something like this first, then compare it to what has been recorded by "pkg", and then also recompile ports whose dependency information has changed... in addition to always compiling ports in dependency order.
Comment 37 Yoshisato Yanagisawa 2015-04-03 16:58:24 UTC
FYI, tl;dr: an ABI change in OPENSSL_ia32_cpuid might be the direct cause of this issue.

I see the same issue, but my symptom is even sadder: vi crashes during editing. This started after I installed the openssl 1.0.2 port on FreeBSD 10.1R.

According to gdb, OPENSSL_ia32_cpuid caused the crash.
> gdb vi
<snip>
> (gdb) run
<edit something until crash>
(gdb) backtrace
#0  0x0000000802ff39e5 in OPENSSL_ia32_cpuid ()
   from /usr/local/lib/libcrypto.so.8
#1  0x00000008039f70b9 in OPENSSL_ia32cap_loc () from /lib/libcrypto.so.7
#2  0x00000008038fd84e in _init () from /lib/libcrypto.so.7
#3  0x00007fffffffca40 in ?? ()
#4  0x00000008006686bf in r_debug_state () from /libexec/ld-elf.so.1
#5  0x000000080066cd87 in _rtld_get_stack_prot () from /libexec/ld-elf.so.1
#6  0x0000000800669ad3 in dlopen () from /libexec/ld-elf.so.1
#7  0x0000000800dfd436 in _nsdbtaddsrc () from /lib/libc.so.7
#8  0x0000000800df73c9 in _nsyyparse () from /lib/libc.so.7
#9  0x0000000800dfdab1 in nsdispatch () from /lib/libc.so.7
#10 0x0000000800deaebe in getpwuid () from /lib/libc.so.7
#11 0x0000000800deacbf in getpwnam () from /lib/libc.so.7
<snip>
(gdb) disass
Dump of assembler code for function OPENSSL_ia32_cpuid:
0x0000000802ff39e0 <OPENSSL_ia32_cpuid+0>:      mov    %rbx,%r8
0x0000000802ff39e3 <OPENSSL_ia32_cpuid+3>:      xor    %eax,%eax
0x0000000802ff39e5 <OPENSSL_ia32_cpuid+5>:      mov    %eax,0x8(%rdi)
0x0000000802ff39e8 <OPENSSL_ia32_cpuid+8>:      cpuid
0x0000000802ff39ea <OPENSSL_ia32_cpuid+10>:     mov    %eax,%r11d
<snip>

Address 0x802ff39e5 is "mov    %eax,0x8(%rdi)".
I compared crypto/x86_64cpuid.pl between 1.0.1l and 1.0.2, and they differ. Is the difference the result of the following change?
https://github.com/openssl/openssl/commit/c5cd28bd64fa2b02f29e74486539e4b2f6741114
With the change, %rdi points to OPENSSL_ia32cap_P.


However, is it natural to use libcrypto.so.8's OPENSSL_ia32_cpuid from /lib/libcrypto.so.7?
To tell the truth, I saw the same issue with postfix and tried "Comment 19 Fix 1: rebuild all components with openssl port", but I still see the issue.
I suspect the dynamic loader misbehaves.
Comment 38 Yoshisato Yanagisawa 2015-04-04 02:44:21 UTC
Just an update on Comment 37 "However, is it natural to use libcrypto.so.8's OPENSSL_ia32_cpuid from /lib/libcrypto.so.7?"
I am using nss_ldap, and the nss-related function calls presumably cause the mixture of libcrypto.so.7 and libcrypto.so.8.
If I install libraries that are called from base libraries, I might not be able to use the openssl port. This seems to have been stated in Comment 28; I should have noticed it.
Comment 39 Bernard Spil freebsd_committer 2015-04-04 09:48:36 UTC
You may be interested in looking at https://bugs.freebsd.org/195796, which contains lists of ports that link base openssl libs (https://wiki.freebsd.org/OpenSSL) even when WITH_OPENSSL_PORT is set.

https://wiki.freebsd.org/LibreSSL contains references to the work done to make all ports work with LibreSSL (WIP). Most things work with non-base OpenSSL, but there are caveats!

To isolate builds (i.e. make sure they can't link base OpenSSL) please look at 
http://bsdxbsdx.blogspot.nl/2015/04/build-packages-in-poudriere-without.html

I've created a wiki page for 1.0.2 specifically to capture issues and fixes https://wiki.freebsd.org/OpenSSL/1.0.2
Comment 40 Bernard Spil freebsd_committer 2015-04-04 10:57:32 UTC
Hi Peter, 

Just built it on my 10.1 amd64 host and not seeing an issue. None of the staged files of postfix link to multiple ssl libs with the standard options.

There are some issues with this port in relation to Ports OpenSSL libs.
SASL_KRB5 will likely break as that relies on kerberos from base (which in turn links base OpenSSL libs). Potentially other pitfalls exist with options that pull in base libs...

This is one of those cases where you need to be very cautious; replacing base SSL libs must not be taken lightly!
Comment 41 M. Macha 2015-04-07 09:44:19 UTC
Hi,

we had to disable ASM on all our x86_64 VMs (VMware) to compile Postfix against openssl 1.0.2.

After that, the rebuild via "portmaster -r openssl" runs fine.

Regards,

Matthias
Comment 42 Bernard Spil freebsd_committer 2015-04-10 19:29:32 UTC
(In reply to Martin Birgmeier from comment #1)
Martin, please check with readelf -d. Many of the conflicts you list are because of libcurl...

E.g. readelf -d /usr/local/libexec/git-core/git-http-fetch returns a dep on Shared library: [libcrypto.so.32] here and Shared library: [libcurl.so.4]
If you have curl compiled with base OpenSSL you'll get what you're seeing.
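
The readelf check can be scripted roughly as follows. This is only a sketch: the function names needed_libs and links_openssl_directly are made up, and it assumes that readelf -d prints each NEEDED entry as "Shared library: [name]", as quoted above. Unlike ldd, which lists everything the runtime linker ends up loading, this shows only the libraries a file requests directly, so a libcrypto pulled in via libcurl does not show up.

```shell
#!/bin/sh
# Hypothetical helper: read `readelf -d` output on stdin and print the
# names of the directly needed shared libraries, one per line.
needed_libs() {
    sed -n 's/.*NEEDED.*Shared library: \[\([^]]*\)\].*/\1/p'
}

# Hypothetical helper: succeeds if the `readelf -d` output on stdin shows
# a direct dependency on libcrypto or libssl.
links_openssl_directly() {
    needed_libs | grep -Eq '^lib(crypto|ssl)\.so'
}
```

For example:

    readelf -d /usr/local/libexec/git-core/git-http-fetch | needed_libs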
Comment 43 Johan Ström 2015-04-10 20:30:59 UTC
For what it's worth:
I've been building using WITH_OPENSSL_PORT=YES in my poudriere setup since I started to build ports.
After installing fresh versions of a few ports, net/asterisk11 failed to start, crashing in OPENSSL_ia32_cpuid.

I suspected curl and tried to rebuild with GSSAPI_NONE instead of GSSAPI_BASE (new option since earlier build). This did not help (full build list: CA_BUNDLE COOKIES DOCS EXAMPLES IPV6 PROXY TLS_SRP GSSAPI_NONE THREADED_RESOLVER OPENSSL).

Asterisk built with CURL FREETDS GSM PGSQL RADIUS SNMP SQLITE SRTP UUID VORBIS XMPP.

Did not dig deeper in asterisk's deps, but instead tried to build openssl port without ASM option. This solved the asterisk problem immediately.
Comment 44 ari 2015-04-11 13:19:57 UTC
Here's another similar issue:

    http://markmail.org/thread/vepgng6krl4cqckz

The symptom is crashing in vi (from base system) and bash (from ports) with a backtrace like this:

    #0  0x00000008029cafe5 in OPENSSL_ia32_cpuid () from /usr/local/lib/libcrypto.so.8
    #1  0x00000008033cf0b9 in OPENSSL_ia32cap_loc () from /lib/libcrypto.so.7
    #2  0x00000008032d584e in _init () from /lib/libcrypto.so.7

Turns out that the conflict between base and ports openssl is tied to nss_ldap and the ldap client libraries.

Compiling everything against base openssl and completely removing the openssl port fixed this for me. Hopefully this helps someone else googling these symptoms.
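
For anyone checking whether a saved backtrace shows this symptom, a small sketch: the helper below lists the distinct libcrypto paths that appear in gdb backtrace text on stdin. Two different paths, as in the frames above, point to the base/ports mix. The function name libcrypto_in_backtrace is made up, and the sketch relies on grep -o, which GNU and BSD grep support but POSIX does not require.

```shell
#!/bin/sh
# Hypothetical helper: read gdb backtrace text on stdin and print each
# distinct libcrypto path mentioned in a "from <path>" clause.
libcrypto_in_backtrace() {
    grep -o 'from [^ ]*libcrypto\.so\.[0-9]*' | sed 's/^from //' | sort -u
}
```

More than one line of output means that frames from different libcrypto versions are on the same stack.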
Comment 45 pvoigt 2015-04-17 21:27:00 UTC
Back from holiday I have found all your helpful posts on this issue - thank you very much.

Today I have finally found some time to rebuild all ports against base openssl. I determined all ports that require a rebuild using pkg. I manually selected the order of ports to rebuild and restarted each service once it had been rebuilt.

Just to be on the safe side I rebooted my server after all necessary ports had been rebuilt. I hope that my approach was not too naive. But I am confident, as my server has been running stably for several hours now while being completely freed from port openssl.

@ Martin Birgmeier: I did not (yet) use your provided scripts. After a quick scan of the sources, I found that I have not enough awk knowledge to understand them as far as I would like to before running them and decided to try a ports rebuild without your scripts.
Comment 46 Dirk Meyer freebsd_committer 2015-04-25 09:41:14 UTC
The port can not undo the API changes from OpenSSL.


The build system inside OpenSSL is not stable,
the assembler options might be re-enabled with the next bugfix release.
Comment 47 Yuri Victorovich freebsd_committer 2015-05-11 22:45:34 UTC
If a new version of the OpenSSL port has incompatible API changes, then a port for the old version should be created (with numbers in its version). And dependent packages should link with this older version.
Comment 48 Bryan Drewery freebsd_committer 2015-05-11 23:15:11 UTC
Bug 199352 was not related.
Comment 49 Matt Smith 2015-06-14 18:38:53 UTC
FYI, regarding the issue where compiling with ASM enabled on some hardware causes massive issues with software like postfix, vim, PHP etc. by corrupting the environment: since the release of 1.0.2c I decided to try again with ASM enabled to see what happens. This time it worked fine, with no issues seen at all. So either something changed between 1.0.2a and 1.0.2c, or something has changed in 10.1-STABLE that works around this, as I've also upgraded that several times since. Either way, it's working fine now.