Bug 244899 - zfs: xattr on a symlink target > 136 causes "bad file descriptor" (on 12.1) and panic on (13 CURRENT) in sa_build_index()
Status: Closed Overcome By Events
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-fs (Nobody)
URL:
Keywords: crash, needs-qa
Depends on:
Blocks:
 
Reported: 2020-03-18 22:55 UTC by y.freebsd
Modified: 2023-09-27 11:07 UTC (History)

See Also:
koobs: mfc-stable12?


Attachments
zfs_vnops.c: VN_OPEN_INVFS (532 bytes, patch)
2020-05-10 07:45 UTC, y.freebsd

Description y.freebsd 2020-03-18 22:55:33 UTC
On ZFS, if a symlink target is longer than 136 characters, setting an xattr on the symlink itself is broken.

On 12.1, the symlink then returns "Bad file descriptor" on the next mount of the dataset (or after the file has not been `stat`ed for a while).

On 13-CURRENT, the call to `setextattr` causes a kernel panic.

Steps to Reproduce:

root@freebsd:~ # zpool create tester /dev/ada1
root@freebsd:~ # zfs create tester/test
root@freebsd:~ # ln -s AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA /tester/test/A
root@freebsd:~ # setextattr -h user A A /tester/test/A


on CURRENT this will cause a panic.
for 12.1:

root@freebsd:~ # zfs unmount tester/test
root@freebsd:~ # zfs mount tester/test
root@freebsd:~ # ls -l /tester/test/
ls: A: Bad file descriptor
total 0

kernel traceback:

panic: solaris assert: IS_SA_BONUSTYPE(bonustype) && SA_HDR_SIZE_MATCH_LAYOUT(hdr, tb) || !IS_SA_BONUSTYPE(bonustype) || (IS_SA_BONUSTYPE(bonustype) && hdr->sa_layout_info == 0), file: /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c, line: 1512
cpuid = 0
time = 1584572024
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0025270530
vpanic() at vpanic+0x182/frame 0xfffffe0025270580
panic() at panic+0x43/frame 0xfffffe00252705e0
assfail() at assfail+0x1a/frame 0xfffffe00252705f0
sa_build_index() at sa_build_index+0x170/frame 0xfffffe0025270700
sa_build_layouts() at sa_build_layouts+0xc22/frame 0xfffffe0025270810
sa_modify_attrs() at sa_modify_attrs+0x4db/frame 0xfffffe0025270910
sa_attr_op() at sa_attr_op+0x4e2/frame 0xfffffe00252709b0
sa_bulk_update_impl() at sa_bulk_update_impl+0xa5/frame 0xfffffe00252709f0
sa_update() at sa_update+0x55/frame 0xfffffe0025270a40
zfs_make_xattrdir() at zfs_make_xattrdir+0x20a/frame 0xfffffe0025270ae0
zfs_get_xattrdir() at zfs_get_xattrdir+0xc1/frame 0xfffffe0025270bf0
zfs_lookup() at zfs_lookup+0x15b/frame 0xfffffe0025270cd0
zfs_setextattr() at zfs_setextattr+0x1ca/frame 0xfffffe0025271000
VOP_SETEXTATTR_APV() at VOP_SETEXTATTR_APV+0x38/frame 0xfffffe0025271020
extattr_set_vp() at extattr_set_vp+0x11d/frame 0xfffffe00252710f0
kern_extattr_set_path() at kern_extattr_set_path+0x10c/frame 0xfffffe0025271330
sys_extattr_set_link() at sys_extattr_set_link+0x29/frame 0xfffffe0025271350
amd64_syscall() at amd64_syscall+0x16d/frame 0xfffffe0025271470
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe0025271470
--- syscall (412, FreeBSD ELF64, sys_extattr_set_link), rip = 0x8002e4eea, rsp = 0x7fffffffd9e8, rbp = 0x7fffffffea90 ---
KDB: enter: panic
[ thread pid 690 tid 100429 ]
Stopped at      kdb_enter+0x37: movq    $0,0x10928e6(%rip)
db> 


if there is any other information i can provide please let me know
Comment 1 y.freebsd 2020-03-22 07:32:53 UTC
when compiling the kernel with these flags (for debugging) the issue goes away

makeoptions 	BUILD_OPTIMIZED=NO
makeoptions 	COPTFLAGS=-O0


this looks like a compiler optimization bug
Comment 2 y.freebsd 2020-05-10 04:04:49 UTC
happens on FreeBSD-11.4-BETA1 and FreeBSD-11.3-RELEASE like on 12.1
Comment 3 y.freebsd 2020-05-10 04:12:36 UTC
Does not happen on FreeBSD-10.4-RELEASE
Comment 4 y.freebsd 2020-05-10 07:45:31 UTC
Created attachment 214336 [details]
zfs_vnops.c: VN_OPEN_INVFS

This might be connected (not sure yet), but
https://github.com/freebsd/freebsd/commit/b4c6542df36523dd75bded6094182c1440a8d076 has a bug where VN_OPEN_INVFS is passed as the wrong argument.

patch attached.
Comment 5 y.freebsd 2020-05-10 19:01:02 UTC
To trigger the bug, the target also has to be shorter than 144 characters.

I noticed the following when this bug is triggered on FreeBSD 11 and 12:

after running ln -s, stat -h returns normally;
when I then run setextattr and rerun stat, the modify, change and birth times are all wrong (they are all the same random time), but atime is correct.

there is probably some sort of buffer overflow corrupting the file
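
The length window described in this comment (more than 136 and fewer than 144 characters) could be probed with a small sh sketch like the one below. The helper names make_target and make_probe_link are hypothetical, and the /tester/test path in the usage comment is simply the scratch dataset from the original report; nothing here is taken from an actual test suite.

```sh
#!/bin/sh
# Sketch only: build symlinks whose targets have an exact length, so the
# reported 137..143 character window can be checked one length at a time.

# Print a string of N 'A' characters (no trailing newline).
make_target() {
    printf "%${1}s" '' | tr ' ' 'A'
}

# Create the symlink DIR/len-N whose target is N characters long.
make_probe_link() {
    probe_dir=$1
    probe_len=$2
    ln -s "$(make_target "$probe_len")" "$probe_dir/len-$probe_len"
}

# Usage, on a throwaway pool only (the report says this panics 13-CURRENT):
#   for n in 136 137 143 144; do
#       make_probe_link /tester/test "$n"
#       setextattr -h user A A "/tester/test/len-$n"
#   done
```

Creating the links is harmless on its own; per the report, it is the later setextattr -h call on an affected kernel that damages the symlink.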
Comment 6 y.freebsd 2020-05-13 00:22:16 UTC
After a lot of git bisecting, the bug turns out to be caused by
r294812
https://svnweb.freebsd.org/base?view=revision&revision=294812
https://github.com/freebsd/freebsd/commit/71d7abc46e4defbcf77033c417935b944c13084a

this was fixed in ZoL in https://github.com/openzfs/zfs/commit/83021b47c2870c0ba948cbcfe08f41bd7730f5fb
with a related fix https://github.com/openzfs/zfs/commit/a62d1b02e372e63862cee276185f2763f641ff10
applying these patches fixes the issue.

please apply these patches to freebsd 11 12 & 13

thanks
Comment 7 Sean Champ 2022-09-21 08:50:19 UTC
I think this may have cleared up, at least in FreeBSD 13.1 (changeset b63021e001d)? E.g. uname:

FreeBSD sol.cloud.thinkum.space 13.1-STABLE FreeBSD 13.1-STABLE #0 build/stable/13-n252436-b63021e001d: Sun Sep 18 22:01:30 PDT 2022     me@sol.cloud.thinkum.space:[sic] amd64


With an earlier build of FreeBSD 13.1 (changeset 41ce229505a) I'd encountered what I think was the same error as here, when building some ports with poudriere on ZFS. 

With that earlier FreeBSD 13.1 build, the uname:
FreeBSD sol.cloud.thinkum.space 13.1-STABLE FreeBSD 13.1-STABLE #0 build/stable/13-n251001-41ce229505a: Sat Jun  4 18:12:09 PDT 2022     me@sol.cloud.thinkum.space:[sic] amd64

The "Bad file descriptor" error, when building lang/rust under Poudriere on ZFS on that kernel and base system:

[...]
-- Installing: /wrkdirs/usr/ports/lang/rust/work/_build/x86_64-unknown-freebsd/llvm/include/llvm/MC/MCObjectFileInfo.h
CMake Error at cmake_install.cmake:41 (file):
  file INSTALL cannot copy file
  "/wrkdirs/usr/ports/lang/rust/work/rustc-1.63.0-src/src/llvm-project/llvm/include/llvm/MC/MCObjectFileInfo.h"
  to
  "/wrkdirs/usr/ports/lang/rust/work/_build/x86_64-unknown-freebsd/llvm/include/llvm/MC/MCObjectFileInfo.h":
  Bad file descriptor.


FAILED: CMakeFiles/install.util 
cd /wrkdirs/usr/ports/lang/rust/work/_build/x86_64-unknown-freebsd/llvm/build && /usr/local/bin/cmake -P cmake_install.cmake
ninja: build stopped: subcommand failed.


I'd seen a similar error with a number of ports - "Bad file descriptor" - before stopping the port upgrade.

Now having updated the local 13.1 build to changeset b63021e001d under stable/13 branch, I've not seen any such error when building rust now. Of course, it will be a few more hours before the entire port upgrade completes, if it could serve any test of the issue.

If it shows up again, I'll look at adding those options to my KERNCONF and rebuilding the kernel. Thx!
Comment 8 Sean Champ 2022-10-22 14:15:04 UTC
For what it's worth, I started seeing this error again, for a significant number of port builds, using zfs in poudriere. Example from an llvm13 build, during the build-depends stage:
~~~~
===>   llvm13-13.0.1_3 depends on executable: ninja - not found
===>   Installing existing package /packages/All/ninja-1.11.1,2.pkg
[xmin.bld.cloud.thinkum.space] Installing ninja-1.11.1,2...
[xmin.bld.cloud.thinkum.space] `-- Installing python38-3.8.15...
[xmin.bld.cloud.thinkum.space] |   `-- Installing libffi-3.4.3...
[xmin.bld.cloud.thinkum.space] |   `-- Extracting libffi-3.4.3: .......... done
[xmin.bld.cloud.thinkum.space] |   `-- Installing mpdecimal-2.5.1...
[xmin.bld.cloud.thinkum.space] |   `-- Extracting mpdecimal-2.5.1: .......... done
[xmin.bld.cloud.thinkum.space] |   `-- Installing readline-8.1.2...
[xmin.bld.cloud.thinkum.space] |   `-- Extracting readline-8.1.2: .......... done
[xmin.bld.cloud.thinkum.space] `-- Extracting python38-3.8.15: ...
pkg-static: Fail to chown /usr/local/lib/python3.8/idlelib/idle_test/.pkgtemp.test_squeezer.py.bFIrNzt0LFou:Bad file descriptor
~~~~

uname in this instance:
~~~~
FreeBSD xmin.cloud.thinkum.space 13.1-STABLE FreeBSD 13.1-STABLE #0 build/stable/13-n252436-b63021e001d: Sun Oct  9 05:52:29 PDT 2022     gimbal@xmin.cloud.thinkum.space:/usr/obj/xmin_FreeBSD-13.1-STABLE_amd64/usr/src/amd64.amd64/sys/XMIN amd64
~~~~

i.e. the kernel was built from changeset b63021e001d in the stable/13 branch.

If it's not an absurd question, could this be related to the number of open file descriptors?

At the time of the errors - with some hundred+ similar build failures while using zfs in poudriere - I'd noticed that the single user gvfsd-trash process had approx 10603 file descriptors showing under 'fstat -p'. I've terminated this process and tried the poudriere build again. Those errors as I was seeing here are not showing up now.
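
A rough way to get the kind of count mentioned above is to count the lines that fstat -p prints for one process. This is only a sketch: count_fds is a hypothetical helper name, and the figure includes every entry fstat lists for the process (text, working directory and so on), not just numbered descriptors.

```sh
#!/bin/sh
# Sketch only: count the entries fstat(1) reports for a single PID.
# fstat prints one header line first, so that line is skipped.
count_fds() {
    fstat -p "$1" | tail -n +2 | wc -l | tr -d ' '
}

# Usage (the PID is a placeholder):
#   count_fds 1234
```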

As a workaround, I'm going to use the following to try to disable gvfsd-trash in effect:

~~~~
mkdir -p /usr/local/etc/gvfs/mounts
for F in /usr/local/share/gvfs/mounts/*.mount; do
    # quote the paths so names with spaces or glob characters survive
    install -l rs "$F" "/usr/local/etc/gvfs/mounts/$(basename "$F")"
done
rm /usr/local/etc/gvfs/mounts/trash.mount
~~~~

then in /etc/profile
~~~~
if [ -e /usr/local/etc/gvfs/mounts ]; then
  export GVFS_MOUNTABLE_DIR=/usr/local/etc/gvfs/mounts
fi
~~~~

Of course, this would not guarantee that /usr/local/etc/gvfs/mounts would stay in sync with /usr/local/share/gvfs/mounts/ then. Alternately, one could patch the devel/gvfs port to add an option to remove trash.mount before packaging

This correlation may not illustrate a causal relation - e.g between the number of file descriptors in the gvfsd-trash process and the build failures under poudriere with ZFS. After closing then disabling gvfsd-trash, I'm not seeing those failures now, in the local ports build.

I'll try rebuilding the kernel with those options. My kernconf from sysctl kern.conftxt:
~~~~
kern.conftxt: options	CONFIG_AUTOGENERATED
ident	XMIN
machine	amd64
cpu	HAMMER
cpu	HAMMER
cpu	HAMMER
makeoptions	MODULES_EXTRA=acpi/acpi_rapidstart
makeoptions	WITH_CTF=1
makeoptions	DEBUG=-g
options	IPI_PREEMPTION
options	MSDOSFS_ICONV
options	CD9660_ICONV
options	LIBICONV
options	FDESCFS
options	FUSEFS
options	GEOM_PART_VTOC8
options	GEOM_PART_LDM
options	GEOM_PART_BSD64
options	GEOM_PART_APM
options	X86BIOS
options	ENABLE_ALART
options	DUMMYNET
options	VT_ALT_TO_ESC_HACK=1
options	MSGBUF_SIZE=(32*PAGE_SIZE)
options	PANIC_REBOOT_WAIT_TIME=-1
options	HZ=1000
options	IICHID_SAMPLING
options	HID_DEBUG
options	EVDEV_SUPPORT
options	XENHVM
options	USB_DEBUG
options	ATH_ENABLE_11N
options	AH_AR5416_INTERRUPT_MITIGATION
options	IEEE80211_SUPPORT_MESH
options	IEEE80211_DEBUG
options	SC_PIXEL_MODE
options	VESA
options	PPS_SYNC
options	COMPAT_LINUXKPI
options	PCI_IOV
options	PCI_HP
options	IOMMU
options	EARLY_AP_STARTUP
options	SMP
options	NETGDB
options	NETDUMP
options	DEBUGNET
options	ZSTDIO
options	GZIO
options	EKCD
options	KDB_TRACE
options	KDB
options	RCTL
options	RACCT_DEFAULT_TO_DISABLED
options	RACCT
options	INCLUDE_CONFIG_FILE
options	DDB_CTF
options	KDTRACE_HOOKS
options	KDTRACE_FRAME
options	MAC
options	CAPABILITIES
options	CAPABILITY_MODE
options	AUDIT
options	KBD_INSTALL_CDEV
options	PRINTF_BUFR_SIZE=128
options	_KPOSIX_PRIORITY_SCHEDULING
options	SYSVSEM
options	SYSVMSG
options	SYSVSHM
options	STACK
options	KTRACE
options	SCSI_DELAY=5000
options	COMPAT_FREEBSD12
options	COMPAT_FREEBSD11
options	COMPAT_FREEBSD10
options	COMPAT_FREEBSD9
options	COMPAT_FREEBSD7
options	COMPAT_FREEBSD6
options	COMPAT_FREEBSD5
options	COMPAT_FREEBSD4
options	COMPAT_FREEBSD32
options	EFIRT
options	GEOM_LABEL
options	GEOM_RAID
options	TMPFS
options	PSEUDOFS
options	PROCFS
options	CD9660
options	MSDOSFS
options	NFS_ROOT
options	NFSLOCKD
options	NFSD
options	NFSCL
options	MD_ROOT
options	QUOTA
options	UFS_GJOURNAL
options	UFS_DIRHASH
options	UFS_ACL
options	SOFTUPDATES
options	FFS
options	KERN_TLS
options	SCTP_SUPPORT
options	TCP_RFC7413
options	TCP_HHOOK
options	TCP_BLACKBOX
options	TCP_OFFLOAD
options	FIB_ALGO
options	ROUTE_MPATH
options	IPSEC_SUPPORT
options	INET6
options	INET
options	VIMAGE
options	PREEMPTION
options	NUMA
options	SCHED_ULE
options	NEW_PCIB
options	GEOM_PART_GPT
options	GEOM_PART_MBR
options	GEOM_PART_EBR
options	GEOM_PART_BSD
options	GEOM_PART_BSD
options	GEOM_PART_EBR
options	GEOM_PART_MBR
options	GEOM_PART_GPT
options	NEW_PCIB
options	SCHED_ULE
options	NUMA
options	PREEMPTION
options	VIMAGE
options	INET
options	INET6
options	IPSEC_SUPPORT
options	ROUTE_MPATH
options	FIB_ALGO
options	TCP_OFFLOAD
options	TCP_BLACKBOX
options	TCP_HHOOK
options	TCP_RFC7413
options	SCTP_SUPPORT
options	KERN_TLS
options	FFS
options	SOFTUPDATES
options	UFS_ACL
options	UFS_DIRHASH
options	UFS_GJOURNAL
options	QUOTA
options	MD_ROOT
options	NFSCL
options	NFSD
options	NFSLOCKD
options	NFS_ROOT
options	MSDOSFS
options	CD9660
options	PROCFS
options	PSEUDOFS
options	TMPFS
options	GEOM_RAID
options	GEOM_LABEL
options	EFIRT
options	COMPAT_FREEBSD32
options	COMPAT_FREEBSD4
options	COMPAT_FREEBSD5
options	COMPAT_FREEBSD6
options	COMPAT_FREEBSD7
options	COMPAT_FREEBSD9
options	COMPAT_FREEBSD10
options	COMPAT_FREEBSD11
options	COMPAT_FREEBSD12
options	SCSI_DELAY=5000
options	KTRACE
options	STACK
options	SYSVSHM
options	SYSVMSG
options	SYSVSEM
options	_KPOSIX_PRIORITY_SCHEDULING
options	PRINTF_BUFR_SIZE=128
options	KBD_INSTALL_CDEV
options	AUDIT
options	CAPABILITY_MODE
options	CAPABILITIES
options	MAC
options	KDTRACE_FRAME
options	KDTRACE_HOOKS
options	DDB_CTF
options	INCLUDE_CONFIG_FILE
options	RACCT
options	RACCT_DEFAULT_TO_DISABLED
options	RCTL
options	KDB
options	KDB_TRACE
options	EKCD
options	GZIO
options	ZSTDIO
options	DEBUGNET
options	NETDUMP
options	NETGDB
options	SMP
options	EARLY_AP_STARTUP
options	IOMMU
options	PCI_HP
options	PCI_IOV
options	COMPAT_LINUXKPI
options	PPS_SYNC
options	VESA
options	SC_PIXEL_MODE
options	IEEE80211_DEBUG
options	IEEE80211_SUPPORT_MESH
options	AH_AR5416_INTERRUPT_MITIGATION
options	ATH_ENABLE_11N
options	USB_DEBUG
options	XENHVM
options	EVDEV_SUPPORT
options	HID_DEBUG
options	IICHID_SAMPLING
options	SCHED_ULE
options	NUMA
options	PREEMPTION
options	VIMAGE
options	INET
options	INET6
options	IPSEC_SUPPORT
options	ROUTE_MPATH
options	FIB_ALGO
options	TCP_OFFLOAD
options	TCP_BLACKBOX
options	TCP_HHOOK
options	TCP_RFC7413
options	SCTP_SUPPORT
options	KERN_TLS
options	FFS
options	SOFTUPDATES
options	UFS_ACL
options	UFS_DIRHASH
options	UFS_GJOURNAL
options	QUOTA
options	MD_ROOT
options	NFSCL
options	NFSD
options	NFSLOCKD
options	NFS_ROOT
options	MSDOSFS
options	CD9660
options	PROCFS
options	PSEUDOFS
options	TMPFS
options	GEOM_RAID
options	GEOM_LABEL
options	EFIRT
options	COMPAT_FREEBSD32
options	COMPAT_FREEBSD4
options	COMPAT_FREEBSD5
options	COMPAT_FREEBSD6
options	COMPAT_FREEBSD7
options	COMPAT_FREEBSD9
options	COMPAT_FREEBSD10
options	COMPAT_FREEBSD11
options	COMPAT_FREEBSD12
options	SCSI_DELAY=5000
options	KTRACE
options	STACK
options	SYSVSHM
options	SYSVMSG
options	SYSVSEM
options	_KPOSIX_PRIORITY_SCHEDULING
options	PRINTF_BUFR_SIZE=128
options	KBD_INSTALL_CDEV
options	AUDIT
options	CAPABILITY_MODE
options	CAPABILITIES
options	MAC
options	KDTRACE_FRAME
options	KDTRACE_HOOKS
options	DDB_CTF
options	INCLUDE_CONFIG_FILE
options	RACCT
options	RACCT_DEFAULT_TO_DISABLED
options	RCTL
options	KDB
options	KDB_TRACE
options	EKCD
options	GZIO
options	ZSTDIO
options	DEBUGNET
options	NETDUMP
options	NETGDB
options	SMP
options	EARLY_AP_STARTUP
options	IOMMU
options	PCI_HP
options	PCI_IOV
options	COMPAT_LINUXKPI
options	PPS_SYNC
options	VESA
options	SC_PIXEL_MODE
options	IEEE80211_DEBUG
options	IEEE80211_SUPPORT_MESH
options	AH_AR5416_INTERRUPT_MITIGATION
options	ATH_ENABLE_11N
options	USB_DEBUG
options	XENHVM
options	EVDEV_SUPPORT
options	HID_DEBUG
options	IICHID_SAMPLING
options	HZ=1000
options	PANIC_REBOOT_WAIT_TIME=-1
options	MSGBUF_SIZE=(32*PAGE_SIZE)
options	HZ=1000
device	isa
device	mem
device	io
device	uart_ns8250
device	cpufreq
device	acpi
device	smbios
device	pci
device	fdc
device	ahci
device	ata
device	mvs
device	siis
device	ahc
device	ahd
device	esp
device	hptiop
device	isp
device	mpt
device	mps
device	mpr
device	sym
device	isci
device	ocs_fc
device	pvscsi
device	scbus
device	ch
device	da
device	sa
device	cd
device	pass
device	ses
device	amr
device	arcmsr
device	ciss
device	iir
device	ips
device	mly
device	twa
device	smartpqi
device	tws
device	aac
device	aacp
device	aacraid
device	ida
device	mfi
device	mlx
device	mrsas
device	pmspcv
device	twe
device	nvme
device	nvd
device	vmd
device	atkbdc
device	atkbd
device	psm
device	kbdmux
device	vga
device	splash
device	sc
device	vt
device	vt_vga
device	vt_efifb
device	vt_vbefb
device	agp
device	cbb
device	pccard
device	cardbus
device	uart
device	ppc
device	ppbus
device	lpt
device	ppi
device	puc
device	iflib
device	em
device	igc
device	ix
device	ixv
device	ixl
device	iavf
device	ice
device	vmx
device	axp
device	bxe
device	le
device	ti
device	mlx5
device	mlxfw
device	mlx5en
device	miibus
device	ae
device	age
device	alc
device	ale
device	bce
device	bfe
device	bge
device	cas
device	dc
device	et
device	fxp
device	gem
device	jme
device	lge
device	msk
device	nfe
device	nge
device	re
device	rl
device	sge
device	sis
device	sk
device	ste
device	stge
device	vge
device	vr
device	xl
device	wlan
device	wlan_wep
device	wlan_ccmp
device	wlan_tkip
device	wlan_amrr
device	an
device	ath
device	ath_pci
device	ath_hal
device	ath_rate_sample
device	ipw
device	iwi
device	iwn
device	malo
device	mwl
device	ral
device	wpi
device	crypto
device	aesni
device	loop
device	padlock_rng
device	rdrand_rng
device	ether
device	vlan
device	tuntap
device	md
device	gif
device	firmware
device	xz
device	bpf
device	uhci
device	ohci
device	ehci
device	xhci
device	usb
device	ukbd
device	umass
device	sound
device	snd_cmi
device	snd_csa
device	snd_emu10kx
device	snd_es137x
device	snd_hda
device	snd_ich
device	snd_via8233
device	mmc
device	mmcsd
device	sdhci
device	rtsx
device	virtio
device	virtio_pci
device	vtnet
device	virtio_blk
device	virtio_scsi
device	virtio_balloon
device	kvm_clock
device	hyperv
device	xenpci
device	netmap
device	evdev
device	uinput
device	hid
device	smbus
device	smb
device	intpm
device	imcsmb
device	ipmi
device	nvram
device	dpms
device	atpic
device	mptable
device	acpi_hp
~~~~

This kernel build had also used the following, in files locally included under /usr/src/sys/amd64/conf/

~~~~
nooptions 	WITNESS
nooptions 	WITNESS_KDB
nooptions 	WITNESS_SKIPSPIN
nooptions 	LOCK_PROFILING
nooptions	CALLOUT_PROFILING
nooptions 	SLEEPQUEUE_PROFILING
nooptions 	TURNSTILE_PROFILING
nooptions 	UMTX_PROFILING
nooptions 	MBUF_PROFILING
nooptions       INVARIANTS
nooptions       INVARIANT_SUPPORT
~~~~

I'll try adding the options recommended above
~~~~
makeoptions 	BUILD_OPTIMIZED=NO
makeoptions 	COPTFLAGS=-O0
~~~~
Comment 9 Sean Champ 2022-10-23 03:57:44 UTC
After rebuilding the kernel from changeset b63021e001d in stable/13 with the additional makeoptions, still seeing the similar (?) error in port builds with ZFS in Poudriere

~~~~
===>   Returning to build of binutils-2.37_4,1
===>   binutils-2.37_4,1 depends on executable: msgfmt - not found
===>   Installing existing package /packages/All/gettext-tools-0.21_1.pkg
[xmin.bld.cloud.thinkum.space] Installing gettext-tools-0.21_1...
[xmin.bld.cloud.thinkum.space] `-- Installing libtextstyle-0.21...
[xmin.bld.cloud.thinkum.space] `-- Extracting libtextstyle-0.21: .......... done
[xmin.bld.cloud.thinkum.space] Extracting gettext-tools-0.21_1: ....
pkg-static: Fail to chown /usr/local/share/doc/gettext/examples/hello-java-awt/po/.pkgtemp.zh_HK.po.BfvtcrBEkY7V:Bad file descriptor
[xmin.bld.cloud.thinkum.space] Extracting gettext-tools-0.21_1... done

Failed to install the following 1 package(s): /packages/All/gettext-tools-0.21_1.pkg
*** Error code 1

Stop.
make: stopped in /usr/ports/devel/binutils
~~~~

Albeit, the symlink example above is not erring here. 

I've noticed the bug - as above - at some points, with symlink files. Previously, it was happening during pkg staging.

Once installed under a builder jail, the file /usr/local/share/doc/gettext/examples/hello-java-awt/po/zh_HK.po is not a symlink.

After some previous occurrences of the bug, during pkg staging in poudriere, I'd thought that it might be related to the item above. Maybe it's a separate bug of some kind.

The 'Bad file descriptor' message continues to show up in port builds locally, when using ZFS in poudriere. There's a workaround in using tmpfs for all builder filesystems in poudriere builds. 

I'll try to make an isolated test case for the bug that I'm seeing with my local FreeBSD build - something beyond the bad FD messages here.
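
For reference, the tmpfs workaround mentioned in this comment is normally selected in poudriere's configuration file. The fragment below is a sketch assuming the stock /usr/local/etc/poudriere.conf location; "all" keeps the builder filesystems in memory, so it needs enough RAM and swap for the ports being built.

```sh
# /usr/local/etc/poudriere.conf (sh-style assignments)
# Assumption: default config path; "all" puts the builder filesystems on tmpfs
# instead of ZFS, which sidesteps the "Bad file descriptor" failures seen here.
USE_TMPFS=all
```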
Comment 10 Jimmy Olgeni freebsd_committer freebsd_triage 2023-06-19 15:38:00 UTC
Just a quick note that I'm seeing this regularly on some VMs, and physical hosts to a lesser degree.

For some reason running "pkg upgrade" on py39-ansible seems to be a sure way to trigger this on the affected boxes.

===

The following 1 package(s) will be affected (of 0 checked):

Installed packages to be UPGRADED:
        py39-ansible: 7.1.0 -> 7.6.0

Number of packages to be upgraded: 1

The process will require 10 MiB more space.
[1/1] Upgrading py39-ansible from 7.1.0 to 7.6.0...
[1/1] Extracting py39-ansible-7.6.0:  38%
pkg: Fail to chmod /usr/local/lib/python3.9/site-packages/ansible_collections/community/general/plugins/modules/__pycache__/.pkgtemp.django_manage.cpython-39.pyc.ub2f8mxmwp9S:Bad file descriptor
[1/1] Extracting py39-ansible-7.6.0: 100%

===

The affected path is never the same.

I ran "script LOG truss -f pkg upgrade -y py39-ansible" in the hope of getting useful data - it was very slow but it worked on the first try, so there may be some timing issue involved? :|
Comment 11 sfourman 2023-08-27 07:06:05 UTC
I am also seeing this on multiple ports building from scratch on FreeBSD 14 ALPHA2

uname -a
FreeBSD ThinkBSD 14.0-ALPHA2 FreeBSD 14.0-ALPHA2 amd64 1400094 #0 main-n264887-332af8c25dfc: Sat Aug 19 20:07:28 EDT 2023     root@ThinkBSD:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64



===>  Cleaning for rust-1.71.0
===>  License APACHE20 MIT accepted by the user
===>   rust-1.71.0 depends on file: /usr/local/sbin/pkg - found
===> Fetching all distfiles required by rust-1.71.0 for building
===>  Extracting for rust-1.71.0
=> SHA256 Checksum OK for rust/rustc-1.71.0-src.tar.xz.
=> SHA256 Checksum OK for rust/2023-06-01/rustc-1.70.0-x86_64-unknown-freebsd.tar.xz.
=> SHA256 Checksum OK for rust/2023-06-01/rust-std-1.70.0-x86_64-unknown-freebsd.tar.xz.
=> SHA256 Checksum OK for rust/2023-06-01/cargo-1.70.0-x86_64-unknown-freebsd.tar.xz.
chmod: /usr/ports/lang/rust/work/rustc-1.71.0-src/vendor/curl-sys/curl/plan9/src/mkfile: Bad file descriptor
*** Error code 1

Stop.
make[1]: stopped in /usr/ports/lang/rust
*** Error code 1

Stop.
make: stopped in /usr/ports/lang/rust

===>>> make build failed for lang/rust
===>>> Aborting update

===>>> Update for lang/rust failed
===>>> Aborting update

===>>> Update for devel/rust-cbindgen failed
===>>> Aborting update

===>>> Update for www/firefox failed
===>>> Aborting update


===>>> You can restart from the point of failure with this command line:
       portmaster <flags> www/firefox devel/rust-cbindgen lang/rust devel/wasi-compiler-rt13 devel/wasi-libcxx devel/yasm devel/git@default textproc/rubygem-asciidoctor textproc/xmlto misc/getopt www/w3m devel/boehm-gc devel/libatomic_ops x11-fonts/terminus-font
Comment 12 Jimmy Olgeni freebsd_committer freebsd_triage 2023-08-30 13:42:55 UTC
Not really a solution for package builds, but I could complete my "pkg upgrade" (after 15 failures) by setting sync=always on ZFS, upgrading, and then resetting it to the default \o/
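
The workaround in this comment can be wrapped so that the property is always put back afterwards. This is a sketch under assumptions: with_sync_always is a hypothetical helper, the dataset name in the usage line is a placeholder, and "resetting it to the default" is interpreted here as zfs inherit, which drops the local setting.

```sh
#!/bin/sh
# Sketch only: run a command with sync=always on one dataset, then
# clear the local setting again, whether or not the command succeeded.
with_sync_always() {
    wsa_dataset=$1
    shift
    zfs set sync=always "$wsa_dataset" || return 1
    "$@"
    wsa_status=$?
    # Assumption: "default" means no local value, so inherit it again.
    zfs inherit sync "$wsa_dataset"
    return "$wsa_status"
}

# Usage (dataset name is a placeholder):
#   with_sync_always zroot/usr/local pkg upgrade -y py39-ansible
```

If the dataset had a non-default sync value set locally beforehand, saving it with zfs get -H -o value sync and restoring it with zfs set would be the safer variant.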
Comment 13 Jimmy Olgeni freebsd_committer freebsd_triage 2023-09-07 08:09:28 UTC
One more point - I observed the same issue in a VM with "low" memory, with 2GBs and very small swap usage around 6MB.

Pkg upgrade kept failing. Then I stopped mysql, gained some memory, temporarily disabled swap for good measure, and the process could complete.

Apparently there is something related to low memory and writing a lot of files / files with long paths (pkg usually fails on upgrading ansible packages).
Comment 14 Eugene M. Zheganin 2023-09-27 06:15:35 UTC
It seems that most people hitting this have nothing to do with storing ports on a symlinked tree. This is the only bug reported so far about this issue, so its synopsis is misleading.

I've observed this on a 13.1 and 13.2, my ports tree is not symlinked, and this happens to me on every large port like devel/llvm13 or www/firefox and is 100% reproducible on "make extract", though the files and their number differ each time. 

As stated earlier the only decent workaround so far is to build using tmpfs.
This state can also be cleared for a while (several hours) by a reboot, but then it always comes back.

There is an opinion that the latter could indicate that the issue lies in the vnode cache.
Comment 15 y.freebsd 2023-09-27 11:05:17 UTC
Fixed by the move to ZOL based ZFS.