Bug 193762 - [cc_cdg] crash after change net.inet.tcp.cc.cdg.smoothing_factor
Summary: [cc_cdg] crash after change net.inet.tcp.cc.cdg.smoothing_factor
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 10.1-STABLE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: Michael Tuexen
URL:
Keywords: patch
Depends on:
Blocks:
 
Reported: 2014-09-19 03:05 UTC by iron.udjin
Modified: 2019-02-08 20:43 UTC

See Also:
tuexen: mfc-stable11?
tuexen: mfc-stable12?


Attachments
cc_cdg bug fix (796 bytes, patch)
2014-11-22 17:43 UTC, Midori Kato
packetdrill script for reproducing the issue (2.87 KB, text/plain)
2019-02-03 12:54 UTC, Michael Tuexen

Description iron.udjin 2014-09-19 03:05:09 UTC
Hello,

FreeBSD 10.1-BETA1 r271827.

1) # kldload cc_cdg
2) # sysctl net.inet.tcp.cc.algorithm=cdg
3) # sysctl net.inet.tcp.cc.cdg.smoothing_factor=0

After that, the system runs for 5-10 seconds, then freezes and restarts. I can reproduce this on both my server and my desktop.
Comment 1 Midori Kato 2014-11-18 23:43:32 UTC
Hi, I tried to reproduce your problem on my machine (OS version: 10.0-RELEASE), but nothing happens. Do you still encounter this problem?
Comment 2 iron.udjin 2014-11-19 20:01:19 UTC
Hello,

Thank you for the reply. I can reproduce it on my desktop with 10.1-STABLE r274707 and on my server with 10.1-STABLE r274618. The bug appears when I run these commands and then generate some network activity, for example by opening a browser or downloading something.
How can I debug it or capture a backtrace before the system restarts?

Thanks
Comment 3 Midori Kato 2014-11-20 07:29:08 UTC
I was able to reproduce your problem and found out why it occurs. You need to apply a patch to your kernel in order to fix it. Please wait a moment; I will generate it soon.
Comment 4 Midori Kato 2014-11-20 17:07:07 UTC
Hi,

I found that you don't need a patch. The reasonable solution is to set smoothing_factor to one instead of zero.
# sysctl net.inet.tcp.cc.cdg.smoothing_factor=1
With this setting, connections should behave exactly the way you want. Could you try it?
Comment 5 iron.udjin 2014-11-20 18:06:16 UTC
Yes, it helps. But according to CC_CDG(4): smoothing_factor - Number of samples used for moving average smoothing (0 means no smoothing).  Default is 8.

That's why I tried to set it to 0. If smoothing_factor=0 is not a valid setting for this tunable, the documentation needs to be changed and setting smoothing_factor to 0 should be rejected. In any case, it's not normal behaviour for the OS to freeze and restart, IMHO.

Thanks
Comment 6 Midori Kato 2014-11-20 18:11:12 UTC
This is purely an implementation issue. I am proposing my fix to a BSD developer. What exactly can I help you with?
Comment 7 iron.udjin 2014-11-20 18:47:03 UTC
This bug is not critical for me, as we have a workaround. I think it would be good to add a note to CC_CDG(4) so that other users do not run into the same problem.

Thank you for your help.
Comment 8 Midori Kato 2014-11-20 21:40:08 UTC
No problem, thank you for your bug report!
Comment 9 Midori Kato 2014-11-22 17:43:55 UTC
Created attachment 149717
cc_cdg bug fix
Comment 10 Midori Kato 2014-11-22 17:45:55 UTC
Hi again,

Could you test the attached patch? If it works correctly, the fix for the bug you found will be merged into the main tree.
Comment 11 iron.udjin 2014-11-22 20:55:18 UTC
I just tested the attached patch on my server. It still restarts when I set net.inet.tcp.cc.cdg.smoothing_factor=0.
Comment 12 Midori Kato 2014-11-22 23:33:48 UTC
Could you check that your revision is correct? If the revision is okay, please show me the commands you use to rebuild your kernel.
Comment 13 iron.udjin 2014-11-23 06:45:25 UTC
uname -a
FreeBSD dev 10.1-STABLE FreeBSD 10.1-STABLE #5 r274887M: Sat Nov 22 22:03:30 EET 2014     root@dev:/usr/obj/usr/src/sys/TG_DEV.debug  amd64

make -j20 buildworld NOCLEAN=YES && make -j20 buildkernel KERNCONF=TG_DEV.debug NOCLEAN=YES && make installworld && make installkernel KERNCONF=TG_DEV.debug
Comment 14 iron.udjin 2014-11-23 07:20:07 UTC
I just deleted everything in /usr/obj/ and built world and kernel again. Same result: the server restarts.
Comment 15 Marcus von Appen freebsd_committer freebsd_triage 2015-02-18 11:54:21 UTC
Updated 10.1-BETA and 10.1-RC versioned bugs to 10.1-STABLE.
Comment 16 Michael Tuexen freebsd_committer 2019-02-03 12:54:20 UTC
Created attachment 201680
packetdrill script for reproducing the issue
Comment 17 Michael Tuexen freebsd_committer 2019-02-03 13:11:23 UTC
I added a potential fix in review D19071.
Comment 18 iron.udjin 2019-02-03 14:45:20 UTC
I just tested the fix on my desktop PC. It works fine and the machine doesn't restart. The packetdrill script also ran successfully without restarting the PC.

Please commit the changes.
Thank you!
Comment 19 iron.udjin 2019-02-03 14:49:15 UTC
Forgot to mention: tested on 12.0-STABLE r343713M
Comment 20 Michael Tuexen freebsd_committer 2019-02-03 16:35:19 UTC
(In reply to iron.udjin from comment #18)
Thanks for testing!

The fix needs to be MFCed to stable/11 and stable/12 after getting it into head.
Comment 21 commit-hook freebsd_committer 2019-02-08 20:43:05 UTC
A commit references this bug:

Author: tuexen
Date: Fri Feb  8 20:42:50 UTC 2019
New revision: 343920
URL: https://svnweb.freebsd.org/changeset/base/343920

Log:
  Ensure that when using the TCP CDG congestion control and setting the
  sysctl variable net.inet.tcp.cc.cdg.smoothing_factor to 0, the smoothing
  is disabled. Without this patch, a division by zero occurs.

  PR:			193762
  Reviewed by:		lstewart@, rrs@
  MFC after:		3 days
  Sponsored by:		Netflix, Inc.
  Differential Revision:	https://reviews.freebsd.org/D19071

Changes:
  head/sys/netinet/cc/cc_cdg.c
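
Note: the failure mode named in the commit log can be illustrated with a minimal userland sketch. It assumes a simple ring-buffer moving average; struct smoother, smooth() and MAX_WINDOW are hypothetical names, and this is not the code committed in r343920. It only shows why a sample count of 0 has to bypass the averaging step instead of being used as a divisor.

```c
#include <stddef.h>

#define MAX_WINDOW 16	/* hypothetical upper bound on the sample count */

/* Hypothetical smoothing state; NOT the kernel's struct cdg. */
struct smoother {
	long window[MAX_WINDOW];	/* ring buffer of recent samples */
	unsigned size;			/* configured sample count, 0 = no smoothing */
	unsigned count;			/* samples currently stored */
	unsigned next;			/* slot that will be overwritten next */
	long sum;			/* running sum of the stored samples */
};

/*
 * Feed one sample and return the smoothed value.
 * With size == 0 the sample is returned unchanged, so the code never
 * divides by, or takes a remainder with, a zero window size.
 */
static long
smooth(struct smoother *s, long sample)
{
	unsigned n = s->size < MAX_WINDOW ? s->size : MAX_WINDOW;

	if (n == 0)
		return (sample);	/* smoothing disabled */

	if (s->count == n)
		s->sum -= s->window[s->next];	/* drop the oldest sample */
	else
		s->count++;

	s->window[s->next] = sample;
	s->sum += sample;
	s->next = (s->next + 1) % n;

	return (s->sum / (long)s->count);
}
```

With size set to 2, feeding 10, 20 and 40 yields 10, 15 and 30; with size set to 0, every sample is returned as-is. Removing the n == 0 check would make the modulo operation a division by zero, which is the class of fault the commit log describes.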