Bug 6837 - in_setpeeraddr() and in_setsockaddr() block on memory
Summary: in_setpeeraddr() and in_setsockaddr() block on memory
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: Unspecified
Hardware: Any
OS: Any
Importance: Normal Affects Only Me
Assignee: freebsd-bugs (Nobody)
 
Reported: 1998-06-02 21:10 UTC by cmetz
Modified: 1998-06-06 10:35 UTC

Description cmetz 1998-06-02 21:10:00 UTC
These two functions now MALLOC their address parameter inline rather
than having the address buffer passed in. They do so with M_WAITOK,
which will tsleep() the process indefinitely waiting for the memory.
Granted, if you're that short on memory on a BSD system, you'll have
bigger problems, but IMO these functions should kick ENOBUFS back up the
stack and get out of kernel mode (thus freeing up some other buffer
memory) rather than block the process.
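
For reference, a minimal sketch of the allocation pattern under discussion; this is illustrative only, not the actual netinet/in_pcb.c source, and the inpcb field copying is elided:

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/malloc.h>
#include <sys/socket.h>
#include <sys/socketvar.h>
#include <netinet/in.h>

static int
in_setsockaddr_sketch(struct socket *so, struct sockaddr **nam)
{
	struct sockaddr_in *sin;

	/*
	 * Current behavior: M_WAITOK may tsleep() the process until the
	 * allocator can satisfy the request.
	 */
	MALLOC(sin, struct sockaddr_in *, sizeof(*sin), M_SONAME, M_WAITOK);

	/*
	 * The alternative argued for above: fail fast instead.
	 *
	 *	MALLOC(sin, struct sockaddr_in *, sizeof(*sin),
	 *	    M_SONAME, M_NOWAIT);
	 *	if (sin == NULL)
	 *		return (ENOBUFS);
	 */

	bzero(sin, sizeof(*sin));
	sin->sin_len = sizeof(*sin);
	sin->sin_family = AF_INET;
	/* ... copy the address and port out of the socket's inpcb ... */
	*nam = (struct sockaddr *)sin;
	return (0);
}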
Comment 1 David Greenman 1998-06-03 04:37:35 UTC
>These two functions now MALLOC their address parameter inline rather
>than having the address buffer passed in. They do so with M_WAITOK,
>which will tsleep() the process indefinitely waiting for the memory.
>Granted, if you're that short on memory on a BSD system, you'll have
>bigger problems, but IMO these functions should kick ENOBUFS back up the
>stack and get out of kernel mode (thus freeing up some other buffer
>memory) rather than block the process.

   Why do you think it should be that way? It won't be an indefinite wait,
just a wait until memory is freed up, which shouldn't be for very long.

-DG

David Greenman
Co-founder/Principal Architect, The FreeBSD Project
Comment 2 cmetz 1998-06-03 10:37:39 UTC
In message <199806030337.UAA19868@implode.root.com>, you write:
>>These two functions now MALLOC their address parameter inline rather
>>than having the address buffer passed in. They do so with M_WAITOK,
>>which will tsleep() the process indefinitely waiting for the memory.
>>Granted, if you're that short on memory on a BSD system, you'll have
>>bigger problems, but IMO these functions should kick ENOBUFS back up the
>>stack and get out of kernel mode (thus freeing up some other buffer
>>memory) rather than block the process.
>
>   Why do you think it should be that way? It won't be an indefinite wait,
>just a wait until memory is freed up, which shouldn't be for very long.

  I guess it's a bit of a philosophical thing.

  If you return ENOBUFS, the app is able to choose what it does about not
having enough memory to complete the operation. Most will quit, thus freeing up
the memory associated with them and helping the global out-of-memory problem
(even if kernel memory is a separate pool, there's almost always some kernel
memory associated with the process). Some will back out transactions, or return
a network failure, or sleep/spin for a while and then try again. But that way,
a well-written app has the choice.

  If you do it M_WAIT style, the kernel tsleep()s the process, and the app
doesn't have any choice. It sleeps, possibly for a long time. By the time it
wakes up, given many typical places where getpeername()/getsockname() are used,
the returned value will be irrelevant.

  Consider the case that so many people break^H^H^H^H^Hdesign their operating
systems for these days, the web. WWW servers typically call getpeername() and
getsockname() early in an incoming connection. If the call fails, the server
can close the connection, freeing up some kernel memory, and the client will
usually return an error. If the call sleeps, the connection will hang, and the
client will simply retry. If I'm not mistaken, unless it's doing asynchronous
I/O, the process won't be awakened by the connection close event, so it sits
there until memory becomes available, holding some itself. Meanwhile, the
client is firing off a new TCP connection and causing another getpeername()/
getsockname() call, which may be able to grab newly-freed memory before the
first process can wake up.

  The other, far easier to explain answer, is that we're putting some code in
there for IPv6 support that makes the malloc() happen at splnet(), and
tsleep()ing at such a priority is not good.

									-Craig
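
To make the web-server case concrete, here is a user-level sketch of the choice being described, assuming getpeername() could fail with ENOBUFS; the function and its policy are hypothetical, not taken from any particular server:

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical per-connection setup in a server's accept loop. */
static int
setup_connection(int fd)
{
	struct sockaddr_in peer;
	int len = sizeof(peer);		/* socklen_t on newer systems */

	if (getpeername(fd, (struct sockaddr *)&peer, &len) < 0) {
		if (errno == ENOBUFS) {
			/*
			 * Kernel memory is short: shed this connection,
			 * freeing its buffers, and let the client retry
			 * instead of this process sleeping in the kernel.
			 */
			close(fd);
			return (-1);
		}
		close(fd);		/* any other failure: give up too */
		return (-1);
	}
	/* ... log peer.sin_addr, then serve the request ... */
	return (0);
}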
Comment 3 cmetz 1998-06-03 12:00:07 UTC
In message <199806031159.EAA22091@implode.root.com>, you write:
>>  The other, far easier to explain answer, is that we're putting some code in
>>there for IPv6 support that makes the malloc() happen at splnet(), and
>>tsleep()ing at such a priority is not good.
>
>   That would be bad, but I don't think the solution is to make it fail on
>temporary resource shortages. I think a better solution would be to change
>the functions to take an already (m)alloced struct sockaddr_in and change
the callers (I think there are only two) to accommodate.

  In 4.4-Lite2, they did basically just that. Why did they change?

									-Craig
Comment 4 cmetz 1998-06-03 12:12:18 UTC
In message <199806031211.FAA22165@implode.root.com>, you write:
>>  In 4.4-Lite2, they did basically just that. Why did they change?
>
>   I believe this was a side effect of the elimination of using mbufs as
>containers for sockaddr data. I don't see a problem with changing the caller
>to malloc(), but perhaps Garrett might have a thought on this since he was
>the one to add the MALLOC there in the first place. Garrett?

  Perhaps it's because only the PF-specific setsockaddr/setpeeraddr function
knows the exact size to allocate. Moving the malloc up a level in the
hierarchy won't actually change the problem of blocking or not if there's not
enough memory for the buffer to hold this, though.

									-Craig
Comment 5 David Greenman 1998-06-03 12:59:40 UTC
>  If you do it M_WAIT style, the kernel tsleep()s the process, and the app
>doesn't have any choice. It sleeps, possibly for a long time. By the time it
>wakes up, given many typical places where getpeername()/getsockname() are used,
>the returned value will be irrelevant.

   In many cases the delay won't be for more than a few microseconds while
the pagedaemon wakes up and frees some cruft. In the worst case, some pages
need to be pushed to swap, but even that should be less than a second of
wait, so I don't see your argument about client retries. I really don't
think it is a good idea to make this non-blocking.

>  The other, far easier to explain answer, is that we're putting some code in
>there for IPv6 support that makes the malloc() happen at splnet(), and
>tsleep()ing at such a priority is not good.

   That would be bad, but I don't think the solution is to make it fail on
temporary resource shortages. I think a better solution would be to change
the functions to take an already (m)alloced struct sockaddr_in and change
the callers (I think there are only two) to accommodate.

-DG

David Greenman
Co-founder/Principal Architect, The FreeBSD Project
Comment 6 David Greenman 1998-06-03 13:11:02 UTC
>In message <199806031159.EAA22091@implode.root.com>, you write:
>>>  The other, far easier to explain answer, is that we're putting some code in
>>>there for IPv6 support that makes the malloc() happen at splnet(), and
>>>tsleep()ing at such a priority is not good.
>>
>>   That would be bad, but I don't think the solution is to make it fail on
>>temporary resource shortages. I think a better solution would be to change
>>the functions to take an already (m)alloced struct sockaddr_in and change
>>the callers (I think there are only two) to accommodate.
>
>  In 4.4-Lite2, they did basically just that. Why did they change?

   I believe this was a side effect of the elimination of using mbufs as
containers for sockaddr data. I don't see a problem with changing the caller
to malloc(), but perhaps Garrett might have a thought on this since he was
the one to add the MALLOC there in the first place. Garrett?

-DG

David Greenman
Co-founder/Principal Architect, The FreeBSD Project
Comment 7 David Greenman 1998-06-03 13:58:22 UTC
>In message <199806031211.FAA22165@implode.root.com>, you write:
>>>  In 4.4-Lite2, they did basically just that. Why did they change?
>>
>>   I believe this was a side effect of the elimination of using mbufs as
>>containers for sockaddr data. I don't see a problem with changing the caller
>>to malloc(), but perhaps Garrett might have a thought on this since he was
>>the one to add the MALLOC there in the first place. Garrett?
>
>  Perhaps it's because only the PF-specific setsockaddr/setpeeraddr function
>knows the exact size to allocate.

   There is a limit to how large the sockaddr can be, so the exact size
doesn't need to be known.

> Moving the malloc up a level in the
>hierarchy won't actually change the problem of blocking or not if there's not
>enough memory for the buffer to hold this, though.

   No, it won't change the problem of blocking for getpeername(), but the
code you're adding is to the pcb.c function, yes? So the blocking won't be
an issue.
   Why do you need to add code that is called at splnet to the in_set*addr
functions in the first place?

-DG

David Greenman
Co-founder/Principal Architect, The FreeBSD Project
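
A sketch of the interface change being suggested here, with the caller allocating a maximum-size buffer and the PF-specific routine only filling it in; the function names are illustrative, not a patch:

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/malloc.h>
#include <sys/socket.h>
#include <sys/socketvar.h>
#include <netinet/in.h>

/*
 * PF-specific side: fill in a caller-supplied buffer.  With no
 * allocation here, nothing can sleep even if this runs at splnet().
 */
static int
in_setsockaddr_fill(struct socket *so, struct sockaddr *sa)
{
	struct sockaddr_in *sin = (struct sockaddr_in *)sa;

	bzero(sin, sizeof(*sin));
	sin->sin_len = sizeof(*sin);
	sin->sin_family = AF_INET;
	/* ... copy the bound address and port out of the inpcb ... */
	return (0);
}

/*
 * Caller side (e.g. the getsockname()/getpeername() code), where it is
 * safe to sleep: SOCK_MAXADDRLEN bounds the size, so the exact length
 * need not be known up front.
 */
static int
get_sockaddr_example(struct socket *so, struct sockaddr **nam)
{
	struct sockaddr *sa;
	int error;

	MALLOC(sa, struct sockaddr *, SOCK_MAXADDRLEN, M_SONAME, M_WAITOK);
	error = in_setsockaddr_fill(so, sa);
	if (error) {
		FREE(sa, M_SONAME);
		return (error);
	}
	*nam = sa;
	return (0);
}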
Comment 8 Garrett A. Wollman 1998-06-03 16:16:13 UTC
<<On Wed, 03 Jun 1998 05:11:02 -0700, David Greenman <dg@root.com> said:

>    I believe this was a side effect of the elimination of using mbufs as
> containers for sockaddr data. I don't see a problem with changing the caller
> to malloc(), but perhaps Garrett might have a thought on this since he was
> the one to add the MALLOC there in the first place. Garrett?

Well, Bruce noted that doing all the copying slows things down
somewhat.  I was planning on reworking the interface Yet Again to make
the setsockaddr/setpeeraddr interface take a struct uio and then copy
the answer directly to its intended location.  (Probably do the same
thing for the other calls that pass out a sockaddr, and make a generic
interface that encapsulates the desired error behavior.)

-GAWollman

--
Garrett A. Wollman   | O Siem / We are all family / O Siem / We're all the same
wollman@lcs.mit.edu  | O Siem / The fires of freedom 
Opinions not those of| Dance in the burning flame
MIT, LCS, CRS, or NSA|                     - Susan Aglukark and Chad Irschick
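
A rough sketch of that direction, with the PF-specific routine copying the answer straight to its destination through a struct uio; purely illustrative, and the eventual rework may look quite different:

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/socket.h>
#include <sys/socketvar.h>
#include <sys/uio.h>
#include <netinet/in.h>

static int
in_setsockaddr_uio(struct socket *so, struct uio *uio)
{
	struct sockaddr_in sin;

	bzero(&sin, sizeof(sin));
	sin.sin_len = sizeof(sin);
	sin.sin_family = AF_INET;
	/* ... fill in the address and port from the inpcb ... */

	/*
	 * uiomove() copies to user or kernel space as the uio directs,
	 * so no intermediate sockaddr buffer needs to be allocated here.
	 */
	return (uiomove((caddr_t)&sin, sizeof(sin), uio));
}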
Comment 9 Poul-Henning Kamp 1998-06-06 10:34:57 UTC
State Changed
From-To: open->closed

I think the issue is out-debated by now.