Ever since I upgraded my LDAP servers to 2.4, *all* of them have shown several classes of problems related to LDAP and NSS.

For example, some assertions trigger during boot (they are gone once the system has finished booting):

<dmesg>
Starting privoxy.
Assertion failed: (r != NULL), function ldap_parse_result, file error.c, line 272.
pid 1261 (csh), uid 201: exited on signal 6 (core dumped)

It is *always* privoxy that is affected. When I was still running dbus/hald/policykit, they would crash on boot too. Once I've logged in, I can restart the services just fine.

But logging in does not work for 60-90 seconds after the getty prompt appears. I enter my username, then it hangs for several seconds (20-30) and drops me back to login with an LDAP error. The third try usually is the charm ...

One very annoying thing is that I continually get errors like this:

Apr 14 13:43:05 roadrunner sudo: nss_ldap: could not search LDAP server - Server is unavailable
Apr 14 13:43:05 roadrunner sudo: nss_ldap: could not search LDAP server - Server is unavailable
Apr 14 13:43:33 roadrunner xterm: nss_ldap: could not search LDAP server - Server is unavailable
Apr 14 13:43:34 roadrunner xterm: nss_ldap: could not search LDAP server - Server is unavailable
Apr 14 13:47:37 roadrunner sudo: nss_ldap: could not search LDAP server - Server is unavailable
Apr 14 13:47:40 roadrunner xterm: nss_ldap: could not search LDAP server - Server is unavailable
Apr 14 13:47:41 roadrunner xterm: nss_ldap: could not search LDAP server - Server is unavailable

Please note that LDAP and NSS are set up correctly and they *work*; the message above is totally bogus!

Another weird thing that started right around when I switched to OpenLDAP 2.4 is that the groups for my user are gone when under X. Running id(1) on the console lists all the groups I'm a member of. Running id(1) inside an xterm, I get *no* secondary groups. This is also true when logging in via ssh. getent(1), on the other hand, works fine.
How-To-Repeat: Upgrade your LDAP client installation from OpenLDAP 2.3 to 2.4. Rebuild nss_ldap and pam_ldap ports.
Responsible Changed From-To: freebsd-ports-bugs->delphij Over to maintainer.
New data point: Everything seems to work fine when using ldapi:// URLs (which is not possible during system startup, though).

If nss_ldap.conf contains:

uri ldaps://roadrunner.spoerlein.net/ ldaps://coyote.spoerlein.net/ ldaps://vpn.spoerlein.net

then id(1) under xterm produces:

% id
May 5 19:24:20 roadrunner xterm: nss_ldap: could not search LDAP server - Server is unavailable
uid=1000(uqs) gid=1000(uqs) groups=1000(uqs)

If I change the URIs to

uri ldapi://%2fvar%2frun%2fopenldap%2fldapi/ ldaps://roadrunner.spoerlein.net/ ldaps://coyote.spoerlein.net/ ldaps://vpn.spoerlein.net

I get

% id
uid=1000(uqs) gid=1000(uqs) groups=1000(uqs),999(users),80(www),0(wheel),5(operator),69(network),68(dialer),13(games)

It breaks even more if I turn on start_tls and use ldapi:// + ldap:// URLs instead. ldapsearch(1) can cope with this setting, but nss_ldap simply barfs.

nss_ldap via id(1)

May 5 19:26:11 roadrunner slapd[1249]: conn=721 fd=57 ACCEPT from PATH=/var/run/openldap/ldapi (PATH=/var/run/openldap/ldapi)
May 5 19:26:11 roadrunner slapd[1249]: conn=721 op=0 EXT oid=1.3.6.1.4.1.1466.20037
May 5 19:26:11 roadrunner slapd[1249]: conn=721 op=0 STARTTLS
May 5 19:26:11 roadrunner slapd[1249]: conn=721 op=0 RESULT oid= err=0 text=
May 5 19:26:11 roadrunner slapd[1249]: conn=721 fd=57 TLS established tls_ssf=256 ssf=256
May 5 19:26:11 roadrunner slapd[1249]: conn=721 op=1 UNBIND

ldapsearch -Lx -H ldapi://%2fvar%2frun%2fopenldap%2fldapi/ -Z

May 5 19:33:11 roadrunner slapd[1249]: conn=819 fd=57 ACCEPT from PATH=/var/run/openldap/ldapi (PATH=/var/run/openldap/ldapi)
May 5 19:33:11 roadrunner slapd[1249]: conn=819 op=0 EXT oid=1.3.6.1.4.1.1466.20037
May 5 19:33:11 roadrunner slapd[1249]: conn=819 op=0 STARTTLS
May 5 19:33:11 roadrunner slapd[1249]: conn=819 op=0 RESULT oid= err=0 text=
May 5 19:33:11 roadrunner slapd[1249]: conn=819 fd=57 TLS established tls_ssf=256 ssf=256
May 5 19:33:11 roadrunner slapd[1249]: conn=819 op=1 BIND dn="" method=128
May 5 19:33:11 roadrunner slapd[1249]: conn=819 op=1 RESULT tag=97 err=0 text=
May 5 19:33:11 roadrunner slapd[1249]: conn=819 op=2 SRCH base="dc=spoerlein,dc=net" scope=2 deref=0 filter="(objectClass=*)"
May 5 19:33:11 roadrunner slapd[1249]: conn=819 op=2 SEARCH RESULT tag=101 err=0 nentries=34 text=
May 5 19:33:11 roadrunner slapd[1249]: conn=819 op=3 UNBIND
May 5 19:33:11 roadrunner slapd[1249]: conn=819 fd=57 closed

So, to see the problem you have to use ldap:// and/or ldaps:// URLs. And, if you're curious, you can then run

while :;do id;done

and see your slapd version 2.4 segfault after 10-30 seconds. Fun!

Cheers,
Ulrich Spoerlein
State Changed From-To: open->feedback Hi, Could you please test if 2.4.9 fixed this? There are a bunch of medium and serious priority bugs fixed in this version.
Thanks for the update. slapd seems to be running much more stably now. No change in nss_ldap behaviour though:

Using ldaps://               everything is fine
Using ldapi:// + start_tls   id(1) stops working
Using ldap:// + start_tls    id(1) will only report primary group, not the secondary groups

Tested on both 6.3 and 7.0

Cheers,
Ulrich Spoerlein
State Changed From-To: feedback->open Submitter indicates that the problem still exists despite other improvements.
Oh boy, now that I've switched to using ldapi:// URLs I rediscovered the ldapi socket leak that we were also experiencing with a Cyrus mailserver, which got its user/group information from an LDAP server via nss_ldap.

I can no longer bind to this socket, as the maximum has been reached:

May 19 18:08:12 acme slapd[57863]: daemon: 4096 beyond descriptor table size 4096

root@acme: ~# lsof -p 57863 | awk '$5 == "unix" {i+=1} END{print i}'
4071

Restarting slapd of course works around the problem, but this is just another annoyance with nss_ldap. I think there's a re-written one in the ports, which I'm going to try next.

root@acme: ~# netstat -an -funix | head -20
Active UNIX domain sockets
Address  Type   Recv-Q Send-Q    Inode     Conn     Refs  Nextref Addr
c964d7e0 stream      0      0        0 c9587e10        0        0 /var/run/openldap/ldapi
c9587e10 stream      0      0        0 c964d7e0        0        0
c9581090 stream      0      0        0 ca1825a0        0        0 /var/run/openldap/ldapi
ca1825a0 stream      0      0        0 c9581090        0        0
c9f78870 stream      0      0        0 ca182240        0        0
ca182240 stream      0      0        0 c9f78870        0        0
c96cf5a0 stream      0      0        0 ca233750        0        0 /var/run/openldap/ldapi
ca233750 stream      0      0        0 c96cf5a0        0        0
c96d2cf0 stream      0      0        0 ca189990        0        0 /var/run/openldap/ldapi
ca189990 stream      0      0        0 c96d2cf0        0        0
ca195750 stream      0      0        0 c9e376c0        0        0 /var/run/openldap/ldapi
c9e376c0 stream      0      0        0 ca195750        0        0
ca1953f0 stream      0      0        0 ca2337e0        0        0 /var/run/openldap/ldapi
ca2337e0 stream      0      0        0 ca1953f0        0        0
c55e1090 stream      0      0        0 c9f79e10        0        0 /var/run/openldap/ldapi
c9f79e10 stream      0      0        0 c55e1090        0        0
ca182900 stream      0      0        0 c741ebd0        0        0 /var/run/openldap/ldapi
c741ebd0 stream      0      0        0 ca182900        0        0

I'm CC'ing Robert Watson, as he might be interested in the UNIX socket issue. I will try reproducing it on RELENG_7.

Cheers,
Ulrich Spoerlein
-- 
It is better to remain silent and be thought a fool, than to speak, and remove all doubt.
Hello!

I've recently dug into the nss_ldap-related assertions we periodically observe in our environment:

Assertion failed: (r != NULL), function ldap_parse_result, file error.c, line 272.
Abort trap: 6

They are indeed start_tls related. The cause is insufficient error checking in ldap-nss.c's do_start_tls(): only a return value of -1 from ldap_result() is treated as a failure. The patch is fairly trivial:

--- ldap-nss.c.backup Thu Nov 6 23:59:33 2008
+++ ldap-nss.c Fri Nov 7 00:05:20 2008
@@ -1387,7 +1387,7 @@
     }
   rc = ldap_result (session->ls_conn, msgid, 1, timeout, &res);
-  if (rc == -1)
+  if (rc <= 0)
     {
 #if defined(HAVE_LDAP_GET_OPTION) && defined(LDAP_OPT_ERROR_NUMBER)
       if (ldap_get_option (session->ls_conn, LDAP_OPT_ERROR_NUMBER, &rc) != LDAP_SUCCESS)

Anyway, this part of the code seems to have been rewritten in nss_ldap 1.264, so an update should resolve the issue too (see ports/129030).

Maxim Dounin
According to user reports this gets fixed in ports/129445 (was: ports/129030).
mm          2008-12-10 16:11:25 UTC

  FreeBSD ports repository

  Modified files:
    net/nss_ldap         Makefile distinfo
    net/nss_ldap/files   patch-ldap-pwd.c
  Added files:
    net/nss_ldap/files   patch-Makefile.am patch-configure.in
  Removed files:
    net/nss_ldap/files   patch-Makefile.in patch-configure
  Log:
  - Update to 1.264 [1]
  - use more autotools [2]
  - fixes assertion problems related to openldap 2.4 [3]

  PR:            ports/129445 [1], ports/127675 [2], ports/122750 [3]
  Submitted by:  mm [1], "Eugene M. Kim" <gene@nttmcl.com> [2]
  Approved by:   maintainer (timeout ports/127675, ports/129030, ports/127675)

  Revision  Changes    Path
  1.26      +5 -3      ports/net/nss_ldap/Makefile
  1.15      +3 -3      ports/net/nss_ldap/distinfo
  1.1       +39 -0     ports/net/nss_ldap/files/patch-Makefile.am (new)
  1.8       +0 -82     ports/net/nss_ldap/files/patch-Makefile.in (dead)
  1.6       +0 -89     ports/net/nss_ldap/files/patch-configure (dead)
  1.1       +26 -0     ports/net/nss_ldap/files/patch-configure.in (new)
  1.3       +3 -3      ports/net/nss_ldap/files/patch-ldap-pwd.c
State Changed From-To: open->closed Fixed in ports/129445.