Bug 269823 - rand_harvest produces 100%CPU on 1 CPU with virtio_random.ko loaded
Summary: rand_harvest produces 100%CPU on 1 CPU with virtio_random.ko loaded
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: John Baldwin
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-02-25 11:34 UTC by Mina Galić
Modified: 2024-01-05 17:16 UTC
CC: 8 users

See Also:


Description Mina Galić freebsd_triage 2023-02-25 11:34:01 UTC
The fact that it reboots just fine suggests that https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=253175 is indeed fixed, but that some other underlying issue is not.

Here's `hotkernel`, until I can find something more useful:

Sampling... Hit Ctrl-C to end.
^C
FUNCTION                                                COUNT   PCNT
dtrace.ko`dtrace_dynvar_clean                               3   0.0%
kernel`0xffffffff81                                         4   0.0%
kernel`pagezero_erms                                        8   0.0%
kernel`hpet_get_timecount                                  18   0.0%
kernel`lock_delay                                          22   0.0%
kernel`spinlock_exit                                      575   0.9%
kernel`kobj_error_method                                  817   1.2%
kernel`virtqueue_poll                                   32208  48.8%
kernel`acpi_cpu_c1                                      32371  49.0%
Comment 1 Mina Galić freebsd_triage 2023-02-25 11:36:37 UTC
One more data point: trying to `kldunload virtio_random.ko` hangs indefinitely.
That is in line with the nature of the fix for https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=253175: it moved the module's de-initialization as far towards the end as possible.
Comment 2 Mina Galić freebsd_triage 2023-02-25 19:53:04 UTC
The procstat kstack output isn't much different or unexpected either:

root@freebsd:~ # procstat -w 1 kstack -v 6
  PID    TID COMM                TDNAME              KSTACK
    6 100074 rand_harvestq       -                   vtrnd_read+0xa4 random_kthread+0x174 fork_exit+0x80 fork_trampoline+0xe
    6 100074 rand_harvestq       -                   vtrnd_read+0xa4 random_kthread+0x174 fork_exit+0x80 fork_trampoline+0xe
    6 100074 rand_harvestq       -                   vtrnd_read+0xa4 random_kthread+0x174 fork_exit+0x80 fork_trampoline+0xe
    6 100074 rand_harvestq       -                   vtrnd_read+0xa4 random_kthread+0x174 fork_exit+0x80 fork_trampoline+0xe
    6 100074 rand_harvestq       -                   vtrnd_read+0xa4 random_kthread+0x174 fork_exit+0x80 fork_trampoline+0xe
    6 100074 rand_harvestq       -                   vtrnd_read+0xa4 random_kthread+0x174 fork_exit+0x80 fork_trampoline+0xe
    6 100074 rand_harvestq       -                   vtrnd_read+0xa4 random_kthread+0x174 fork_exit+0x80 fork_trampoline+0xe
    6 100074 rand_harvestq       -                   vtrnd_read+0xa4 random_kthread+0x174 fork_exit+0x80 fork_trampoline+0xe
    6 100074 rand_harvestq       -                   vtrnd_read+0xa4 random_kthread+0x174 fork_exit+0x80 fork_trampoline+0xe
    6 100074 rand_harvestq       -                   vtrnd_read+0xa4 random_kthread+0x174 fork_exit+0x80 fork_trampoline+0xe
    6 100074 rand_harvestq       -                   vtrnd_read+0xa4 random_kthread+0x174 fork_exit+0x80 fork_trampoline+0xe
    6 100074 rand_harvestq       -                   vtrnd_read+0xa4 random_kthread+0x174 fork_exit+0x80 fork_trampoline+0xe
    6 100074 rand_harvestq       -                   vtrnd_read+0xa4 random_kthread+0x174 fork_exit+0x80 fork_trampoline+0xe
    6 100074 rand_harvestq       -                   vtrnd_read+0xa4 random_kthread+0x174 fork_exit+0x80 fork_trampoline+0xe
    6 100074 rand_harvestq       -                   vtrnd_read+0xa4 random_kthread+0x174 fork_exit+0x80 fork_trampoline+0xe
    6 100074 rand_harvestq       -                   vtrnd_read+0xa4 random_kthread+0x174 fork_exit+0x80 fork_trampoline+0xe
Comment 3 Mina Galić freebsd_triage 2023-02-28 12:48:07 UTC
More info on the hardware, before I try to dive into what's actually going wrong in the code:

Works:

# devinfo -p vtrnd0 -v
vtrnd0 pnpinfo vendor=0x00001af4 device=0x1005 subvendor=0x1af4 device_type=0x00000004
 virtio_pci0 pnpinfo vendor=0x1af4 device=0x1005 subvendor=0x1af4 subdevice=0x0004 class=0x00ff00 at slot=2 function=0 dbsf=pci0:0:2:0 handle=\_SB_.PCI0.S10_
 pci0
 pcib0 pnpinfo _HID=PNP0A03 _UID=1 _CID=none at handle=\_SB_.PCI0
 acpi0
 nexus0

# pciconf -lv | grep -B2 -A1 RNG
virtio_pci0@pci0:0:2:0: class=0x00ff00 rev=0x00 hdr=0x00 vendor=0x1af4 device=0x1005 subvendor=0x1af4 subdevice=0x0004
    vendor     = 'Red Hat, Inc.'
    device     = 'Virtio RNG'
    class      = old

# sysctl kern.random
kern.random.fortuna.concurrent_read: 1
kern.random.fortuna.minpoolsize: 64
kern.random.rdrand.rdrand_independent_seed: 0
kern.random.use_chacha20_cipher: 1
kern.random.block_seeded_status: 0
kern.random.random_sources: 'VirtIO Entropy Adapter','Intel Secure Key RNG'
kern.random.harvest.mask_symbolic: PURE_VIRTIO,PURE_RDRAND,[CALLOUT],[UMA],[FS_ATIME],SWI,INTERRUPT,NET_NG,[NET_ETHER],NET_TUN,MOUSE,KEYBOARD,ATTACH,CACHED
kern.random.harvest.mask_bin: 00000010010000000111011111
kern.random.harvest.mask: 590303
kern.random.initial_seeding.disable_bypass_warnings: 0
kern.random.initial_seeding.arc4random_bypassed_before_seeding: 0
kern.random.initial_seeding.read_random_bypassed_before_seeding: 0
kern.random.initial_seeding.bypass_before_seeding: 1


Misbehaves:

# devinfo -p vtrnd0 -v
vtrnd0 pnpinfo vendor=0x00001af4 device=0x1044 subvendor=0x1af4 device_type=0x00000004
 virtio_pci1 pnpinfo vendor=0x1af4 device=0x1044 subvendor=0x1af4 subdevice=0x1100 class=0x00ff00 at slot=0 function=1 dbsf=pci0:1:0:1
 pci1
 pcib1 pnpinfo vendor=0x1b36 device=0x000c subvendor=0x1b36 subdevice=0x0000 class=0x060400 at slot=1 function=0 dbsf=pci0:0:1:0 handle=\_SB_.PCI0.S08_
 pci0
 pcib0 pnpinfo _HID=PNP0A08 _UID=0 _CID=PNP0A03 at handle=\_SB_.PCI0
 acpi0
 nexus0

# pciconf -lv | grep -B2 -A1 RNG
virtio_pci1@pci0:1:0:1: class=0x00ff00 rev=0x01 hdr=0x00 vendor=0x1af4 device=0x1044 subvendor=0x1af4 subdevice=0x1100
    vendor     = 'Red Hat, Inc.'
    device     = 'Virtio RNG'
    class      = old

# sysctl kern.random
kern.random.fortuna.concurrent_read: 1
kern.random.fortuna.minpoolsize: 64
kern.random.rdrand.rdrand_independent_seed: 0
kern.random.use_chacha20_cipher: 1
kern.random.block_seeded_status: 0
kern.random.random_sources: 'VirtIO Entropy Adapter','Intel Secure Key RNG'
kern.random.harvest.mask_symbolic: PURE_VIRTIO,PURE_RDRAND,[CALLOUT],[UMA],[FS_ATIME],SWI,INTERRUPT,NET_NG,[NET_ETHER],NET_TUN,MOUSE,KEYBOARD,ATTACH,CACHED
kern.random.harvest.mask_bin: 00000010010000000111011111
kern.random.harvest.mask: 590303
kern.random.initial_seeding.disable_bypass_warnings: 0
kern.random.initial_seeding.arc4random_bypassed_before_seeding: 0
kern.random.initial_seeding.read_random_bypassed_before_seeding: 0
kern.random.initial_seeding.bypass_before_seeding: 1

The only difference I can see is in the devinfo output.
Comment 4 Mina Galić freebsd_triage 2023-02-28 13:16:45 UTC
red herring from pciconf -lv, thanks to this bug: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=269874
Comment 5 Mina Galić freebsd_triage 2023-03-02 00:17:37 UTC
I've been debugging this on and off for the past two days.
At first I thought that the

dtrace -n 'kinst::virtqueue_poll:'

output was not very useful, until I realized that it constantly repeats:

  1  73355               virtqueue_poll:112 
  1  73356               virtqueue_poll:115 
  1  73357               virtqueue_poll:118 
  1  73358               virtqueue_poll:124 
  1  73359               virtqueue_poll:129 
  1  73360               virtqueue_poll:133 
  1  73361               virtqueue_poll:135 
  1  73362               virtqueue_poll:137 
  1  73363               virtqueue_poll:141 
  1  73364               virtqueue_poll:144 
  1  73365               virtqueue_poll:151 
  1  73366               virtqueue_poll:155 
  1  73367               virtqueue_poll:158 
  1  73355               virtqueue_poll:112 
  1  73356               virtqueue_poll:115 
  1  73357               virtqueue_poll:118 
  1  73358               virtqueue_poll:124 
  1  73359               virtqueue_poll:129 
  1  73360               virtqueue_poll:133 
  1  73361               virtqueue_poll:135 
  1  73362               virtqueue_poll:137 
  1  73363               virtqueue_poll:141 
  1  73364               virtqueue_poll:144 
  1  73365               virtqueue_poll:151 
  1  73366               virtqueue_poll:155 
  1  73367               virtqueue_poll:158 


etc…
This function never returns; we never even get (back) into vtrnd_read:

dtrace -n 'kinst::vtrnd_read:'
dtrace: description 'kinst::vtrnd_read:' matched 88 probes
… # this never prints anything on this machine

So let's look at those instructions:

0xffffffff80a113e0 <+112>:   48 89 df        mov    %rbx,%rdi
0xffffffff80a113e3 <+115>:   ff 50 08        call   *0x8(%rax)
0xffffffff80a113e6 <+118>:   41 0f b7 4c 24 56       movzwl 0x56(%r12),%ecx
0xffffffff80a113ec <+124>:   49 8b 44 24 48  mov    0x48(%r12),%rax
0xffffffff80a113f1 <+129>:   66 3b 48 02     cmp    0x2(%rax),%cx
0xffffffff80a113f5 <+133>:	75 32	jne    0xffffffff80a11429 <virtqueue_poll+185>
0xffffffff80a113f7 <+135>:	f3 90	pause  
0xffffffff80a113f9 <+137>:	49 8b 1c 24	mov    (%r12),%rbx
0xffffffff80a113fd <+141>:	48 8b 0b	mov    (%rbx),%rcx
0xffffffff80a11400 <+144>:	0f b6 15 59 c0 0c 01	movzbl 0x10cc059(%rip),%edx        # 0xffffffff81add460 <virtio_bus_poll_desc>
0xffffffff80a11407 <+151>:	48 8b 04 d1	mov    (%rcx,%rdx,8),%rax
0xffffffff80a1140b <+155>:	4c 39 38	cmp    %r15,(%rax)
0xffffffff80a1140e <+158>:	74 d0	je     0xffffffff80a113e0 <virtqueue_poll+112>

In other words, in C:

void *
virtqueue_poll(struct virtqueue *vq, uint32_t *len)
{
	void *cookie;

	VIRTIO_BUS_POLL(vq->vq_dev);
	while ((cookie = virtqueue_dequeue(vq, len)) == NULL) {
		cpu_spinwait();
		VIRTIO_BUS_POLL(vq->vq_dev);
	}

	return (cookie);
}


(if *0x8(%rax) is an obfuscated call to virtqueue_dequeue() then…)

This is literally just the inner loop here.

That means: virtqueue_dequeue() never returns anything other than NULL.
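For readers following along: the `cmp 0x2(%rax),%cx` at +129 looks like the split-virtqueue used-ring check behind that NULL — the driver's consumer index compared against the device-written `idx` field, which sits at offset 2 of the used ring, right after `flags`. Here is a toy userspace sketch of that condition, with simplified, hypothetical names (the real bookkeeping lives in sys/dev/virtio/virtqueue.c):

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the split-virtqueue bookkeeping; these are
 * NOT the actual FreeBSD structures, just enough to show the check. */
struct vring_used {
	uint16_t flags;
	uint16_t idx;		/* advanced by the device after each completion */
};

struct virtqueue_sim {
	uint16_t used_cons_idx;	/* driver's consumer index */
	struct vring_used *used;	/* shared ring, device-writable */
};

/* Mirrors the NULL-return condition seen in the disassembly: nothing to
 * dequeue while the device-side idx equals the driver's consumer index. */
static void *
virtqueue_dequeue_sim(struct virtqueue_sim *vq)
{
	if (vq->used_cons_idx == vq->used->idx)
		return (NULL);		/* device hasn't completed anything */
	vq->used_cons_idx++;
	return ((void *)(uintptr_t)vq->used_cons_idx);	/* fake cookie */
}
```

On the misbehaving machine the device apparently never advances `idx`, so the equality holds forever and virtqueue_poll() spins on cpu_spinwait().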

Why, I don't know yet. There are several virtqueue_dequeue()s running on the system, according to dtrace, but none of them seem to be running on CPU1, like our virtqueue_poll().

My assumption here is:

rand_harvestq running on CPU1 (true) → virtqueue_poll() running on CPU1 (true)
Therefore, our virtqueue_dequeue() should be running on CPU1 as well.

But either I'm not seeing it on CPU1 (in fact, it's running on all other CPUs except CPU1), or I don't know how to filter for it; either way, this doesn't work:

root@freebsd:~ # dtrace -n 'kinst::virtqueue_dequeue: /cpu == 1/'
dtrace: invalid probe specifier kinst::virtqueue_dequeue: /cpu == 1/: syntax error near end of input
Comment 6 Mina Galić freebsd_triage 2023-03-02 01:31:18 UTC
n.b.: virtio_random is the only call site of virtqueue_poll() that passes something other than NULL as the *len parameter:

sys/dev/virtio/console/virtio_console.c
988:		virtqueue_poll(vq, NULL);
1382:		virtqueue_poll(vq, NULL);

sys/dev/virtio/random/virtio_random.c
288:	virtqueue_poll(vq, &rdlen);

sys/dev/virtio/block/virtio_blk.c
1442:	virtqueue_poll(vq, NULL);

sys/dev/virtio/network/if_vtnet.c
3526:		virtqueue_poll(vq, NULL);

though I don't see how that would influence it.
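For context on what that difference means: the second argument is an out-parameter through which virtqueue_poll() reports how many bytes the device wrote, and the other call sites simply discard it by passing NULL. A toy sketch of that contract (hypothetical simplified function, not the kernel's virtqueue_poll()):

```c
#include <stddef.h>
#include <stdint.h>

static int toy_cookie;	/* stands in for the dequeued buffer cookie */

/* Toy model of the out-parameter contract: console/blk/net pass NULL
 * because they don't care how much was written, while virtio_random
 * needs the byte count to know how much entropy it actually got. */
static void *
toy_virtqueue_poll(uint32_t completed_len, uint32_t *len)
{
	if (len != NULL)
		*len = completed_len;	/* only virtio_random reads this back */
	return (&toy_cookie);
}
```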
Comment 7 Mina Galić freebsd_triage 2023-03-02 01:39:05 UTC
re DTrace stumbler: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=269906
if done "right", it shows no virtqueue_dequeue() on CPU1.
Comment 8 Mina Galić freebsd_triage 2023-03-03 23:21:39 UTC
see also this review: https://reviews.freebsd.org/D38898
Comment 9 Mina Galić freebsd_triage 2023-09-06 11:06:17 UTC
A commit towards fixing this issue has been made; see

base f1c5a2e3a625053e2b70d5b1777d849a4d9328f2
review D41656

This just leaves the question of an MFC; hence, adding the author and committer of the above to this bug.
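For anyone landing here from search, my reading of the referenced change (an assumption on my part; the review D41656 and the commit message are authoritative) is that it stops parking the harvest thread in a busy-wait: a request is posted without blocking, and the result is only collected once the device has actually completed it. A toy userspace sketch of that non-blocking pattern, with hypothetical names (not the driver code):

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of a device that may or may not have completed a request. */
struct toy_vq {
	bool request_pending;	/* a buffer has been posted to the device */
	bool request_done;	/* the device has filled it */
	uint32_t done_len;	/* bytes the device produced */
};

/* Non-blocking read: instead of spinning in virtqueue_poll(), check once
 * and come back on the next harvest cycle if the device isn't ready. */
static uint32_t
toy_vtrnd_read(struct toy_vq *vq)
{
	if (!vq->request_pending) {
		vq->request_pending = true;	/* post a request, don't wait */
		return (0);
	}
	if (!vq->request_done)
		return (0);			/* not ready; try again later */
	vq->request_pending = false;
	vq->request_done = false;
	return (vq->done_len);			/* harvest the completed buffer */
}
```

With this shape, a device that never completes a request costs the harvest queue nothing per cycle, instead of pinning a CPU at 100%.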
Comment 10 John Baldwin freebsd_committer freebsd_triage 2023-10-12 16:01:10 UTC
Forgot to tag this bug in the merges to stable/13 (c31927eb866f18b87cc196f14cbbcf658e10d38c) and stable/14 (b7f5e993476f6e445895383e73aed01741f8b9cf).

I might try to get this into 14.0 as well.
Comment 11 Mark Johnston freebsd_committer freebsd_triage 2023-10-13 19:07:07 UTC
(In reply to John Baldwin from comment #10)
I think a merge to 14.0 would be a good idea, given that this bug's been reported in several contexts.
Comment 12 John Baldwin freebsd_committer freebsd_triage 2023-10-16 23:06:07 UTC
Merged to 14.0 in commit c1e05a5e1a3e94633a050026e26211aaae59cabc.
Comment 13 Mina Galić freebsd_triage 2023-11-03 11:26:37 UTC
Perhaps it's a good idea to MFC this to 13.x as well, then.
Comment 14 sparkblast 2023-12-16 19:25:17 UTC
Has this been merged to 13.x yet, or will it be?
Comment 15 Mina Galić freebsd_triage 2023-12-16 19:45:36 UTC
Yeah, it was merged into stable/13 in base c31927eb866f18b87cc196f14cbbcf658e10d38c, but I don't think that has been released yet.
Comment 16 sparkblast 2024-01-05 12:37:13 UTC
(In reply to Mina Galić from comment #15)
Thanks for your reply.
Can I see somewhere when it will be released, and in which version?

We are currently using OPNsense, which is based on FreeBSD 13.2, and we are affected by this bug.
Comment 17 Mark Johnston freebsd_committer freebsd_triage 2024-01-05 15:58:14 UTC
(In reply to sparkblast from comment #16)
The fix will appear in 13.3, which has not been officially announced but will likely be released within a couple of months.
Comment 18 Kyle Evans freebsd_committer freebsd_triage 2024-01-05 17:12:52 UTC
(In reply to Mark Johnston from comment #17)

The schedule's just recently been announced: https://www.freebsd.org/releases/13.3R/schedule/

Expected around March officially (so no contradiction)
Comment 19 Mark Johnston freebsd_committer freebsd_triage 2024-01-05 17:16:04 UTC
(In reply to Kyle Evans from comment #18)
Oh, hmm.  I was going off of https://www.freebsd.org/releng/, which looks like it needs an update.