Bug 238707 - [PATCH][LOR] Lock order reversal: rtentry vs "nd6 list"
Summary: [PATCH][LOR] Lock order reversal: rtentry vs "nd6 list"
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Only Me
Assignee: Alexander V. Chernikov
URL:
Keywords: patch
Depends on:
Blocks:
 
Reported: 2019-06-19 15:05 UTC by Arnaud Ysmal
Modified: 2019-07-31 08:44 UTC (History)
1 user (show)

See Also:


Attachments
Patch to move the unlock of the rtentry just before the call of rt_notifydelete (663 bytes, patch)
2019-06-19 15:05 UTC, Arnaud Ysmal
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Arnaud Ysmal 2019-06-19 15:05:49 UTC
Created attachment 205221 [details]
Patch to move the unlock of the rtentry just before the call of rt_notifydelete

Hi,

I encounter the following lor in the function rtrequest1_fib (in the RTM_DELETE case of the switch).

When deleting a rtentry, it is returned locked by rt_unlinkrte then during the call to rt_notifydelete the "nd6 list" lock is locked.

In the RTM_ADD switch case, the old_rt in unlink by rt_unlinkrte then unlocked and passed to the rt_notifydelete, with this I assume there is no issue to call rt_notifydelete with a rtentry unlocked.

The patch enclosed move the unlock of the rtentry just before the call of rt_notifydelete.

The witness trace is the following:
[2019-05-01 23:21:12] lock order reversal:
[2019-05-01 23:21:12] 1st 0xfffff80036683998 rtentry (rtentry) @ net/route.c:1230
[2019-05-01 23:21:12] 2nd 0xffffffff8147fe68 nd6 list (nd6 list) @ netinet6/nd6_rtr.c:545
[2019-05-01 23:21:12] stack backtrace:
[2019-05-01 23:21:12] #0 0xffffffff80544e10 at witness_debugger+0x70
[2019-05-01 23:21:12] #1 0xffffffff80544d10 at witness_checkorder+0xc90
[2019-05-01 23:21:12] #2 0xffffffff804ecfda at __rw_rlock+0x3a
[2019-05-01 23:21:12] #3 0xffffffff8068f798 at defrouter_lookup+0x28
[2019-05-01 23:21:12] #4 0xffffffff80689b40 at nd6_rtrequest+0x50
[2019-05-01 23:21:12] #5 0xffffffff805f4a66 at rtrequest1_fib+0x286
[2019-05-01 23:21:12] #6 0xffffffff805f89ff at route_output+0x77f
[2019-05-01 23:21:12] #7 0xffffffff8056ef7f at sosend_generic+0x43f
[2019-05-01 23:21:12] #8 0xffffffff8054f90d at soo_write+0x2d
[2019-05-01 23:21:12] #9 0xffffffff80548b0a at dofilewrite+0x8a
[2019-05-01 23:21:12] #10 0xffffffff805487e8 at kern_writev+0x68
[2019-05-01 23:21:12] #11 0xffffffff80548776 at sys_write+0x86
[2019-05-01 23:21:12] #12 0xffffffff80753bf9 at amd64_syscall+0x299
[2019-05-01 23:21:12] #13 0xffffffff80734bed at fast_syscall_common+0x101
Comment 1 Alexander V. Chernikov freebsd_committer 2019-06-19 23:31:13 UTC
Thank you for reporting the LOR and providing the patch!

The fix might be slightly more complicated, as nothing stops another thread changing the gateway of the same rte to change rt->rt_ifa so we would call the handler on the new if a instead of the old one. I'm testing a slightly different patch and will hopefully commit it tomorrow.
Comment 2 Arnaud Ysmal 2019-07-31 08:44:24 UTC
Is it me or ifa_request is only used by nd6 and only for the delete case?

By "slightly different patch" do you mean something like, completely removing ifa_rtrequest from the struct ifaddr?