On the amd64 platform, if a 32-bit process ever sets its rlimit manually, none of its 64-bit children or later descendants can get the full 64-bit rlimit again, even if they explicitly set the limit to unlimited. For simplicity, this report refers only to the datasize limit, but the same logic applies to the other memory segment limits (stacksize, etc.).

Take the following scenario as an example:
1) A 32-bit process p1 sets its hard limit to 500MB by calling setrlimit().
2) p1 then execs another 32-bit process, p2.
3) p2 sets its hard limit to unlimited by calling setrlimit().
4) p2 execs a 64-bit process, p3.
5) Checking the hard limit of p3 shows only 3GB (the value of ia32_maxdsiz) instead of 32GB, which is the global kernel limit (the value of maxdsiz) for a 64-bit process.

The root cause is that in step 3, p2 did not actually set its limit to the requested value when calling setrlimit(). Instead, the limit was set to ia32_maxdsiz, because ia32_fixlimit() is called in kern_proc_setrlimit().

Fix:
The proposed fix is to change kern_proc_setrlimit() so that sv_fixlimit() is not called when the caller wants to set the new limit to RLIM_INFINITY. Please refer to the attached diff file for the proposed fix.

How-To-Repeat:
Three test programs are attached to this report: 32_p1.c, 32_p2.c, and 64_p3.c. They can be used to reproduce the problem.
1) Compile 32_p1.c and 32_p2.c into 32-bit binaries. Compile 64_p3.c into a 64-bit binary.
2) Put all three binaries into the same directory on a machine running FreeBSD amd64.
3) Run 32_p1, which execs 32_p2, which in turn execs 64_p3. The output of 64_p3 shows that its limit is capped at ia32_maxdsiz.
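The five steps above can be followed in a small user-space model. This is only a sketch, not the kernel code and not the attached test programs: every model_* name is hypothetical, the 3GB and 32GB ceilings are the ia32_maxdsiz and maxdsiz values quoted in the report, and the clamp helper merely stands in for what the report says ia32_fixlimit() does. It assumes a privileged caller, since step 3 raises a hard limit.

```c
#include <stdint.h>

/* Hypothetical stand-ins for RLIM_INFINITY, ia32_maxdsiz and maxdsiz. */
#define MODEL_RLIM_INFINITY UINT64_MAX
#define MODEL_IA32_MAXDSIZ  (3ULL << 30)   /* 3GB, as quoted in the report */
#define MODEL_MAXDSIZ       (32ULL << 30)  /* 32GB, as quoted in the report */

/* Clamp a requested value to an ABI ceiling. */
static uint64_t model_clamp(uint64_t value, uint64_t ceiling)
{
    return value > ceiling ? ceiling : value;
}

/* Reported behaviour: a 32-bit caller always has its request clamped. */
static uint64_t model_set_current(uint64_t requested)
{
    return model_clamp(requested, MODEL_IA32_MAXDSIZ);
}

/* Proposed behaviour: skip the clamp when the request is infinity. */
static uint64_t model_set_proposed(uint64_t requested)
{
    if (requested == MODEL_RLIM_INFINITY)
        return requested;
    return model_clamp(requested, MODEL_IA32_MAXDSIZ);
}

/* What a 64-bit process sees for an inherited stored value. */
static uint64_t model_seen_by_64bit(uint64_t stored)
{
    return model_clamp(stored, MODEL_MAXDSIZ);
}

/* Walk steps 1-5; returns the hard limit observed by p3. */
static uint64_t model_scenario(int use_proposed_fix)
{
    /* Step 1: p1 (32-bit) sets the hard limit to 500MB. */
    uint64_t hard = model_set_current(500ULL << 20);

    /* Step 2: p2 inherits the value across exec (nothing to do here).
     * Step 3: p2 (32-bit) asks for unlimited. */
    hard = use_proposed_fix
        ? model_set_proposed(MODEL_RLIM_INFINITY)
        : model_set_current(MODEL_RLIM_INFINITY);

    /* Steps 4-5: p3 (64-bit) inherits and inspects the limit. */
    return model_seen_by_64bit(hard);
}
```

Calling model_scenario(0) from a main() models the reported behaviour and yields the 3GB ceiling; model_scenario(1) models the proposed change and yields the 32GB ceiling.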
The 'fix' is wrong and does not address the issue. Instead, it takes some arbitrary properties of the scenario you considered and adapts the kernel code to suit that scenario. You deny the correction of the infinity limit; I do not see how that can be right.

The problem you described is architectural. By design, Unix resource limits cannot be increased after they were decreased, except by root. In your scenario, the limits were decreased by the mere fact of running a 32bit process, which has lower 'infinity' limits than 64bit processes.

That said, I see two possible solutions.

The first is to manually set the compat.ia32.max* sysctls to 0. Then, it seems, you get the desired behaviour for 64bit processes execed from 32bit ones. It does not require a code change. Since you are fine with denying the fix for infinity, this setting gives the same effect as the patch.

The second approach (which is essentially a correction to your approach from fix.diff) is to track the fact that the corresponding rlimits are set to 'ABI infinity', in some per-struct rlimit flag. Then get/setrlimit should first test the 'ABI infinity' flag and behave as if the rlimit is set to infinity for the current bitness, even if the actual value of the rlimit is not infinity. The flag is set when the rlimit is set to infinity by the current ABI.

The second approach would provide a 'correct' fix, but it is a non-trivial amount of work for a very rare situation (execing a 64bit process from a 32bit one), and the current behaviour of inheriting the 32bit limits may be argued to be right. If you want, feel free to develop such a patch; I will review and commit it, but I do not want to spend effort on developing it myself ATM.
Do not strip public lists from the discussion. There is nothing private.

On Tue, Aug 07, 2012 at 05:52:07PM -0400, Ming Qiao wrote:
> Hi Konstantin,
>
> Thanks for your quick response. Actually I'm not very clear about
> the second approach you mentioned. Some questions here:
> 1) Could you please elaborate the idea of "tracking rlimits set to
> ABI infinity"? If I understand correctly, you are referring to a
> model where a process can have its rlimit set multiple times by
> different ABIs? But what does it mean exactly? Could you give a
> simple example here?
> 2) What do you mean by "per-struct rlimit"? Do you mean each memory
> segment as a struct, such as datasize, stacksize, etc.?

I mean that in addition to the existing pl_rlimit array in struct plimit, you also create a bitmap array of the same size. A set bit in this new array would indicate that the corresponding limit was set to infinity (either implicitly, or explicitly by usermode). The bit has its meaning regardless of the actual numeric value written into pl_rlimit, either by the syscall or by sv_fixup.

Then the 64bit sysent should also grow an sv_fixup for resource limits, and set the limit accordingly for the host ABI if the array indicates that the resource is logically 'infinite'.

For completeness, I should note that the bit is cleared if a syscall sets the resource to a non-infinite value.

Per-struct rlimit means that there is a bit for each resource.

Is it clear now?
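The bitmap idea described above can be sketched in user space as follows. This is an illustration under stated assumptions, not kernel code: the sk_* names, the struct layout, and the single 32-bit mask are all hypothetical, and the ceilings are the 3GB and 32GB values quoted earlier in this report. The sketch reads "behave as if rlimit is set to infinity for current bitness" as "report the current ABI's ceiling", which is one possible interpretation.

```c
#include <assert.h>
#include <stdint.h>

#define SK_NLIMITS    4u                /* hypothetical number of resources */
#define SK_INFINITY   UINT64_MAX        /* stand-in for RLIM_INFINITY */
#define SK_IA32_MAX   (3ULL << 30)      /* 32-bit ceiling from the report */
#define SK_NATIVE_MAX (32ULL << 30)     /* 64-bit ceiling from the report */

/* Hypothetical analogue of struct plimit with one extra bit per resource. */
struct sk_plimit {
    uint64_t value[SK_NLIMITS];   /* numeric limits, possibly ABI-clamped */
    uint32_t infinite;            /* bit i set: limit i is logically infinite */
};

/* Set a limit on behalf of an ABI whose ceiling is abi_max. */
static void sk_set(struct sk_plimit *pl, unsigned res, uint64_t req,
                   uint64_t abi_max)
{
    assert(res < SK_NLIMITS);
    if (req == SK_INFINITY)
        pl->infinite |= (uint32_t)1 << res;     /* remember "infinity" */
    else
        pl->infinite &= ~((uint32_t)1 << res);  /* finite value clears it */
    /* The stored number is still clamped for the calling ABI. */
    pl->value[res] = req > abi_max ? abi_max : req;
}

/* Effective limit as seen by an ABI whose ceiling is abi_max. */
static uint64_t sk_effective(const struct sk_plimit *pl, unsigned res,
                             uint64_t abi_max)
{
    assert(res < SK_NLIMITS);
    if (pl->infinite & ((uint32_t)1 << res))
        return abi_max;  /* logically infinite: use this ABI's ceiling */
    return pl->value[res] > abi_max ? abi_max : pl->value[res];
}

/* p1 sets 500MB, p2 sets p2_request (both 32-bit), p3 (64-bit) looks. */
static uint64_t sk_scenario(uint64_t p2_request)
{
    struct sk_plimit pl = { {0}, 0 };

    sk_set(&pl, 0, 500ULL << 20, SK_IA32_MAX);   /* p1 */
    sk_set(&pl, 0, p2_request, SK_IA32_MAX);     /* p2 */
    return sk_effective(&pl, 0, SK_NATIVE_MAX);  /* p3 */
}
```

With this sketch, sk_scenario(SK_INFINITY) gives the 64-bit ceiling even though the stored number was clamped to the 32-bit one, while a finite request such as sk_scenario(1ULL << 30) clears the bit and is inherited unchanged.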
Thanks for the explanation. I'll prepare a fix and send it to you for review when it's ready.

...Ming
For bugs matching the following criteria: Status: In Progress Changed: (is less than) 2014-06-01 Reset to default assignee and clear in-progress tags. Mail being skipped
Keyword: patch or patch-ready – in lieu of summary line prefix: [patch] * bulk change for the keyword * summary lines may be edited manually (not in bulk). Keyword descriptions and search interface: <https://bugs.freebsd.org/bugzilla/describekeywords.cgi>
^Triage: Update issue to reflect latest (2012) state.

- Clarify summary to reflect the problem as reported
- Reporter aimed to submit an updated patch (needs-patch)
- kib@ suggested two approaches (one a workaround, one a non-trivial change for a minor case)

Request feedback from the (hopefully still available) reporter.

@Konstantin If your second, non-trivial code change is unlikely to be accepted, we also have the option of closing this as Not Accepted based on what we have so far (an initial patch that was not accepted).