Bug 225330 - clang bug can incorrectly enable or disable interrupts on amd64
Summary: clang bug can incorrectly enable or disable interrupts on amd64
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: bin (show other bugs)
Version: CURRENT
Hardware: amd64 Any
: --- Affects Some People
Assignee: Dimitry Andric
URL:
Keywords:
Depends on: 227686 227698 227699
Blocks:
  Show dependency treegraph
 
Reported: 2018-01-20 01:03 UTC by Jonathan T. Looney
Modified: 2018-05-05 10:55 UTC (History)
7 users (show)

See Also:
dim: mfc-stable11+


Attachments
Patch to make clang do a slightly better job at restoring the EFLAGS register (2.97 KB, patch)
2018-01-20 01:03 UTC, Jonathan T. Looney
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jonathan T. Looney freebsd_committer 2018-01-20 01:03:28 UTC
Created attachment 189922 [details]
Patch to make clang do a slightly better job at restoring the EFLAGS register

While making a change to kernel code, I recently discovered a compiler bug that can incorrectly enable or disable interrupts on amd64.

Imagine this code:

void
fx(void)
{
	_Bool needtocall;

	mtx_lock_spin(&lock);
	needtocall = (--count == 0);
	mtx_unlock_spin(&lock);
	if (needtocall)
		otherfx();
}


If you compile this code in clang (5.0.0, 5.0.1, or 6.0.0 all behave similarly in the critical respect) either in a kernel module or with mutex inlining disabled, you get this assembly (dumped with `objdump -S` to interleave the assembly and source code):

void
fx(void)
{
   0:	55                   	push   %rbp
   1:	48 89 e5             	mov    %rsp,%rbp
	_Bool needtocall;

	mtx_lock_spin(&lock);
   4:	53                   	push   %rbx
   5:	50                   	push   %rax
   6:	48 c7 c7 00 00 00 00 	mov    $0x0,%rdi
   d:	31 f6                	xor    %esi,%esi
   f:	48 c7 c2 00 00 00 00 	mov    $0x0,%rdx
  16:	b9 18 00 00 00       	mov    $0x18,%ecx
  1b:	e8 00 00 00 00       	callq  20 <fx+0x20>
	needtocall = (--count == 0);
  20:	ff 0c 25 00 00 00 00 	decl   0x0
  27:	9c                   	pushfq 
  28:	5b                   	pop    %rbx
	mtx_unlock_spin(&lock);
  29:	48 c7 c7 00 00 00 00 	mov    $0x0,%rdi
  30:	31 f6                	xor    %esi,%esi
  32:	48 c7 c2 00 00 00 00 	mov    $0x0,%rdx
  39:	b9 1a 00 00 00       	mov    $0x1a,%ecx
  3e:	e8 00 00 00 00       	callq  43 <fx+0x43>
	if (needtocall)
		otherfx();
  43:	48 83 c4 08          	add    $0x8,%rsp
  47:	53                   	push   %rbx
  48:	9d                   	popfq  
	_Bool needtocall;

	mtx_lock_spin(&lock);
	needtocall = (--count == 0);
	mtx_unlock_spin(&lock);
	if (needtocall)
  49:	74 03                	je     4e <fx+0x4e>
		otherfx();
}
  4b:	5b                   	pop    %rbx
  4c:	5d                   	pop    %rbp
  4d:	c3                   	retq   

	mtx_lock_spin(&lock);
	needtocall = (--count == 0);
	mtx_unlock_spin(&lock);
	if (needtocall)
		otherfx();
  4e:	5b                   	pop    %rbx
  4f:	5d                   	pop    %rbp
  50:	e9 00 00 00 00       	jmpq   55 <count+0x51>



The critical part is the way that clang handles the conditional:

1. Decrement the value. This will set ZF if count reaches 0:

  20:	ff 0c 25 00 00 00 00 	decl   0x0


2. Now, save the status flags (particularly, ZF) using pushfq. Critically, pushfq saves the entire EFLAGS value, which includes the interrupt flag. Because we are holding a spin mutex, interrupts are disabled at this point:

  27:	9c                   	pushfq 
  28:	5b                   	pop    %rbx


3. Now, call the function to unlock the spin mutex, which will potentially enable interrupts.


4. Now, restore the status flags. Critically, popfq restores the entire EFLAGS value, which includes the interrupt flag. Because interrupts were disabled when the flags were stored, these instructions will disable them again:

  43:	48 83 c4 08          	add    $0x8,%rsp
  47:	53                   	push   %rbx
  48:	9d                   	popfq  


5. Now, use the status flags to determine whether to call the function:

  49:	74 03                	je     4e <fx+0x4e>



After running this code, interrupts may remain erroneously disabled.

(This is a contrived minimal test case, but I did run into this bug while testing a code change in the kernel. Interrupts remained erroneously disabled after dropping a spin lock. And, in that case, the code was in the actual kernel -- not a module.)

By now, the bug should be obvious: clang shouldn't save and restore the entire EFLAGS register. It doesn't know what changes these other function calls make to the values of the EFLAGS register about which clang doesn't care (e.g. IF, IOPL, AC, etc.). By saving and restoring EFLAGS across function calls, it can load incorrect values for these other flags.

It looks like the correct solutions are to either not save EFLAGS across function calls, or to just save the individual flag(s) that need to be evaluated across the function call. (For example, a simple setz/test would probably be sufficient -- and more efficient -- in this case.) However, these are very significant changes that are complicated.

Two other things to note:

1. If clang knows that the target hardware supports lahf/sahf, it will use those instructions instead. That solves the problem, as it doesn't reset any flags other than status flags. (It still may not be as efficient as setz, but at least it is safe for kernel code.) lahf/sahf are supported on all i386 and *most* (but not all) amd64 CPUs. (Apparently, a few older amd64 CPUs didn't support the instruction.) Therefore, clang assumes lahf/sahf are supported on all i386 architectures, but assume it is not supported on amd64 unless the user specifies a target CPU class known to support it. This is a long way of saying that the default amd64 FreeBSD build assumes that it is not supported, leading to the buggy use of pushf/popf shown above.

2. In practice, this seems to very seldom show up in our kernel binaries. Even a trivial change to the above code (for example, "--count == 5" instead of "--count == 0") would make the compiler no longer save the EFLAGS register across the function call. I quickly audited the kernel and modules I compile and use at work, and found only one place that compiles with the pushf/popf instructions to save/restore the EFLAGS register across a function call. (Of course, this fragility can work in the reverse: a trivial change, such as the one I tried when I found this bug, can cause the compiler to use the pushf/popf instructions where it had not previously and, therefore, introduce a bug in interrupt handling.)


Finally, I have a patch that "fixes" (or, probably more accurately, "works around") the bug. It changes the pushf/popf so that the program will retrieve the latest EFLAGS register at restore time and only modify the status bits.

In C Pseudo-code, it does this:

save_eflags(register dest) {

    dest = load_eflags();
    dest &= (OF|SF|ZF|AF|PF|CF);
}

restore_eflags(register src) {
    register RAX;

    RAX = load_eflags();
    RAX &= ~(OF|SF|ZF|AF|PF|CF);
    RAX |= src;
    set_eflags(RAX);
}


The new assembly sequence looks like this:

Save EFLAGS:

  28:	9c                   	pushfq 
  29:	5b                   	pop    %rbx
  2a:	48 81 e3 d5 08 00 00 	and    $0x8d5,%rbx


Restore EFLAGS:

  4f:	9c                   	pushfq 
  50:	48 8b 04 24          	mov    (%rsp),%rax
  54:	48 25 2a f7 ff ff    	and    $0xfffffffffffff72a,%rax
  5a:	48 09 d8             	or     %rbx,%rax
  5d:	48 89 04 24          	mov    %rax,(%rsp)
  61:	9d                   	popfq  


This is far from ideal, but is better than what is there. And, the good news is that the compiler seems to actually save/restore EFLAGS very few places in our kernel code (hopefully, none which are in hot code paths). So, perhaps, this is good enough for now.

Yet another alternative would be to enable lahf/sahf by default, and provide a way for people to disable it if their CPU doesn't support those instructions (with an appropriate warning about how it may be buggy).
Comment 1 Dimitry Andric freebsd_committer 2018-01-20 11:39:43 UTC
It would be much better to take this issue upstream.  We would not want to carry patches like this without them having been checked over first.
Comment 2 Jonathan T. Looney freebsd_committer 2018-01-20 19:43:57 UTC
(In reply to Dimitry Andric from comment #1)
I filed an upstream bug report today: https://bugs.llvm.org/show_bug.cgi?id=36028.
Comment 3 Jonathan T. Looney freebsd_committer 2018-04-11 18:53:42 UTC
The upstream bug is fixed in r329657 and r329673. Also, dim@ fixed a related bug that allows us to use -msahf as a workaround (committed in upstream r325446). Once those revisions appear in our tree, this bug should be fixed.
Comment 4 commit-hook freebsd_committer 2018-04-14 12:07:24 UTC
A commit references this bug:

Author: dim
Date: Sat Apr 14 12:07:07 UTC 2018
New revision: 332501
URL: https://svnweb.freebsd.org/changeset/base/332501

Log:
  Pull in r325446 from upstream clang trunk (by me):

    [X86] Add 'sahf' CPU feature to frontend

    Summary:
    Make clang accept `-msahf` (and `-mno-sahf`) flags to activate the
    `+sahf` feature for the backend, for bug 36028 (Incorrect use of
    pushf/popf enables/disables interrupts on amd64 kernels).  This was
    originally submitted in bug 36037 by Jonathan Looney
    <jonlooney@gmail.com>.

    As described there, GCC also uses `-msahf` for this feature, and the
    backend already recognizes the `+sahf` feature. All that is needed is
    to teach clang to pass this on to the backend.

    The mapping of feature support onto CPUs may not be complete; rather,
    it was chosen to match LLVM's idea of which CPUs support this feature
    (see lib/Target/X86/X86.td).

    I also updated the affected test case (CodeGen/attr-target-x86.c) to
    match the emitted output.

    Reviewers: craig.topper, coby, efriedma, rsmith

    Reviewed By: craig.topper

    Subscribers: emaste, cfe-commits

    Differential Revision: https://reviews.llvm.org/D43394

  Pull in r328944 from upstream llvm trunk (by Chandler Carruth):

    [x86] Expose more of the condition conversion routines in the public
    API for X86's instruction information. I've now got a second patch
    under review that needs these same APIs. This bit is nicely
    orthogonal and obvious, so landing it. NFC.

  Pull in r329414 from upstream llvm trunk (by Craig Topper):

    [X86] Merge itineraries for CLC, CMC, and STC.

    These are very simple flag setting instructions that appear to only
    be a single uop. They're unlikely to need this separation.

  Pull in r329657 from upstream llvm trunk (by Chandler Carruth):

    [x86] Introduce a pass to begin more systematically fixing PR36028
    and similar issues.

    The key idea is to lower COPY nodes populating EFLAGS by scanning the
    uses of EFLAGS and introducing dedicated code to preserve the
    necessary state in a GPR. In the vast majority of cases, these uses
    are cmovCC and jCC instructions. For such cases, we can very easily
    save and restore the necessary information by simply inserting a
    setCC into a GPR where the original flags are live, and then testing
    that GPR directly to feed the cmov or conditional branch.

    However, things are a bit more tricky if arithmetic is using the
    flags.  This patch handles the vast majority of cases that seem to
    come up in practice: adc, adcx, adox, rcl, and rcr; all without
    taking advantage of partially preserved EFLAGS as LLVM doesn't
    currently model that at all.

    There are a large number of operations that techinaclly observe
    EFLAGS currently but shouldn't in this case -- they typically are
    using DF.  Currently, they will not be handled by this approach.
    However, I have never seen this issue come up in practice. It is
    already pretty rare to have these patterns come up in practical code
    with LLVM. I had to resort to writing MIR tests to cover most of the
    logic in this pass already.  I suspect even with its current amount
    of coverage of arithmetic users of EFLAGS it will be a significant
    improvement over the current use of pushf/popf. It will also produce
    substantially faster code in most of the common patterns.

    This patch also removes all of the old lowering for EFLAGS copies,
    and the hack that forced us to use a frame pointer when EFLAGS copies
    were found anywhere in a function so that the dynamic stack
    adjustment wasn't a problem. None of this is needed as we now lower
    all of these copies directly in MI and without require stack
    adjustments.

    Lots of thanks to Reid who came up with several aspects of this
    approach, and Craig who helped me work out a couple of things
    tripping me up while working on this.

    Differential Revision: https://reviews.llvm.org/D45146

  Pull in r329673 from upstream llvm trunk (by Chandler Carruth):

    [x86] Model the direction flag (DF) separately from the rest of
    EFLAGS.

    This cleans up a number of operations that only claimed te use EFLAGS
    due to using DF. But no instructions which we think of us setting
    EFLAGS actually modify DF (other than things like popf) and so this
    needlessly creates uses of EFLAGS that aren't really there.

    In fact, DF is so restrictive it is pretty easy to model. Only STD,
    CLD, and the whole-flags writes (WRFLAGS and POPF) need to model
    this.

    I've also somewhat cleaned up some of the flag management instruction
    definitions to be in the correct .td file.

    Adding this extra register also uncovered a failure to use the
    correct datatype to hold X86 registers, and I've corrected that as
    necessary here.

    Differential Revision: https://reviews.llvm.org/D45154

  Together, these should ensure clang does not use pushf/popf sequences to
  save and restore flags, avoiding problems with unrelated flags (such as
  the interrupt flag) being restored unexpectedly.

  Requested by:	jtl
  PR:		225330
  MFC after:	1 week

Changes:
  head/contrib/llvm/include/llvm/CodeGen/MachineBasicBlock.h
  head/contrib/llvm/lib/CodeGen/MachineBasicBlock.cpp
  head/contrib/llvm/lib/Target/X86/Disassembler/X86Disassembler.cpp
  head/contrib/llvm/lib/Target/X86/X86.h
  head/contrib/llvm/lib/Target/X86/X86FlagsCopyLowering.cpp
  head/contrib/llvm/lib/Target/X86/X86ISelLowering.cpp
  head/contrib/llvm/lib/Target/X86/X86ISelLowering.h
  head/contrib/llvm/lib/Target/X86/X86InstrCompiler.td
  head/contrib/llvm/lib/Target/X86/X86InstrInfo.cpp
  head/contrib/llvm/lib/Target/X86/X86InstrInfo.h
  head/contrib/llvm/lib/Target/X86/X86InstrInfo.td
  head/contrib/llvm/lib/Target/X86/X86InstrSystem.td
  head/contrib/llvm/lib/Target/X86/X86RegisterInfo.td
  head/contrib/llvm/lib/Target/X86/X86Schedule.td
  head/contrib/llvm/lib/Target/X86/X86ScheduleAtom.td
  head/contrib/llvm/lib/Target/X86/X86TargetMachine.cpp
  head/contrib/llvm/tools/clang/include/clang/Driver/Options.td
  head/contrib/llvm/tools/clang/lib/Basic/Targets/X86.cpp
  head/contrib/llvm/tools/clang/lib/Basic/Targets/X86.h
  head/lib/clang/freebsd_cc_version.h
  head/lib/clang/libllvm/Makefile
Comment 5 commit-hook freebsd_committer 2018-04-14 14:57:42 UTC
A commit references this bug:

Author: dim
Date: Sat Apr 14 14:57:34 UTC 2018
New revision: 332503
URL: https://svnweb.freebsd.org/changeset/base/332503

Log:
  Revert r332501 for now, as it can cause build failures on i386.
  Reported upstream as <https://bugs.llvm.org/show_bug.cgi?id=37133>.

  Reported by:	emaste, ci.freebsd.org
  PR:		225330

Changes:
  head/contrib/llvm/include/llvm/CodeGen/MachineBasicBlock.h
  head/contrib/llvm/lib/CodeGen/MachineBasicBlock.cpp
  head/contrib/llvm/lib/Target/X86/Disassembler/X86Disassembler.cpp
  head/contrib/llvm/lib/Target/X86/X86.h
  head/contrib/llvm/lib/Target/X86/X86FlagsCopyLowering.cpp
  head/contrib/llvm/lib/Target/X86/X86ISelLowering.cpp
  head/contrib/llvm/lib/Target/X86/X86ISelLowering.h
  head/contrib/llvm/lib/Target/X86/X86InstrCompiler.td
  head/contrib/llvm/lib/Target/X86/X86InstrInfo.cpp
  head/contrib/llvm/lib/Target/X86/X86InstrInfo.h
  head/contrib/llvm/lib/Target/X86/X86InstrInfo.td
  head/contrib/llvm/lib/Target/X86/X86InstrSystem.td
  head/contrib/llvm/lib/Target/X86/X86RegisterInfo.td
  head/contrib/llvm/lib/Target/X86/X86Schedule.td
  head/contrib/llvm/lib/Target/X86/X86ScheduleAtom.td
  head/contrib/llvm/lib/Target/X86/X86TargetMachine.cpp
  head/contrib/llvm/tools/clang/include/clang/Driver/Options.td
  head/contrib/llvm/tools/clang/lib/Basic/Targets/X86.cpp
  head/contrib/llvm/tools/clang/lib/Basic/Targets/X86.h
  head/lib/clang/freebsd_cc_version.h
  head/lib/clang/libllvm/Makefile
Comment 6 commit-hook freebsd_committer 2018-04-20 18:21:57 UTC
A commit references this bug:

Author: dim
Date: Fri Apr 20 18:20:56 UTC 2018
New revision: 332833
URL: https://svnweb.freebsd.org/changeset/base/332833

Log:
  Recommit r332501, with an additional upstream fix for "Cannot lower
  EFLAGS copy that lives out of a basic block!" errors on i386.

  Pull in r325446 from upstream clang trunk (by me):

    [X86] Add 'sahf' CPU feature to frontend

    Summary:
    Make clang accept `-msahf` (and `-mno-sahf`) flags to activate the
    `+sahf` feature for the backend, for bug 36028 (Incorrect use of
    pushf/popf enables/disables interrupts on amd64 kernels).  This was
    originally submitted in bug 36037 by Jonathan Looney
    <jonlooney@gmail.com>.

    As described there, GCC also uses `-msahf` for this feature, and the
    backend already recognizes the `+sahf` feature. All that is needed is
    to teach clang to pass this on to the backend.

    The mapping of feature support onto CPUs may not be complete; rather,
    it was chosen to match LLVM's idea of which CPUs support this feature
    (see lib/Target/X86/X86.td).

    I also updated the affected test case (CodeGen/attr-target-x86.c) to
    match the emitted output.

    Reviewers: craig.topper, coby, efriedma, rsmith

    Reviewed By: craig.topper

    Subscribers: emaste, cfe-commits

    Differential Revision: https://reviews.llvm.org/D43394

  Pull in r328944 from upstream llvm trunk (by Chandler Carruth):

    [x86] Expose more of the condition conversion routines in the public
    API for X86's instruction information. I've now got a second patch
    under review that needs these same APIs. This bit is nicely
    orthogonal and obvious, so landing it. NFC.

  Pull in r329414 from upstream llvm trunk (by Craig Topper):

    [X86] Merge itineraries for CLC, CMC, and STC.

    These are very simple flag setting instructions that appear to only
    be a single uop. They're unlikely to need this separation.

  Pull in r329657 from upstream llvm trunk (by Chandler Carruth):

    [x86] Introduce a pass to begin more systematically fixing PR36028
    and similar issues.

    The key idea is to lower COPY nodes populating EFLAGS by scanning the
    uses of EFLAGS and introducing dedicated code to preserve the
    necessary state in a GPR. In the vast majority of cases, these uses
    are cmovCC and jCC instructions. For such cases, we can very easily
    save and restore the necessary information by simply inserting a
    setCC into a GPR where the original flags are live, and then testing
    that GPR directly to feed the cmov or conditional branch.

    However, things are a bit more tricky if arithmetic is using the
    flags.  This patch handles the vast majority of cases that seem to
    come up in practice: adc, adcx, adox, rcl, and rcr; all without
    taking advantage of partially preserved EFLAGS as LLVM doesn't
    currently model that at all.

    There are a large number of operations that techinaclly observe
    EFLAGS currently but shouldn't in this case -- they typically are
    using DF.  Currently, they will not be handled by this approach.
    However, I have never seen this issue come up in practice. It is
    already pretty rare to have these patterns come up in practical code
    with LLVM. I had to resort to writing MIR tests to cover most of the
    logic in this pass already.  I suspect even with its current amount
    of coverage of arithmetic users of EFLAGS it will be a significant
    improvement over the current use of pushf/popf. It will also produce
    substantially faster code in most of the common patterns.

    This patch also removes all of the old lowering for EFLAGS copies,
    and the hack that forced us to use a frame pointer when EFLAGS copies
    were found anywhere in a function so that the dynamic stack
    adjustment wasn't a problem. None of this is needed as we now lower
    all of these copies directly in MI and without require stack
    adjustments.

    Lots of thanks to Reid who came up with several aspects of this
    approach, and Craig who helped me work out a couple of things
    tripping me up while working on this.

    Differential Revision: https://reviews.llvm.org/D45146

  Pull in r329673 from upstream llvm trunk (by Chandler Carruth):

    [x86] Model the direction flag (DF) separately from the rest of
    EFLAGS.

    This cleans up a number of operations that only claimed te use EFLAGS
    due to using DF. But no instructions which we think of us setting
    EFLAGS actually modify DF (other than things like popf) and so this
    needlessly creates uses of EFLAGS that aren't really there.

    In fact, DF is so restrictive it is pretty easy to model. Only STD,
    CLD, and the whole-flags writes (WRFLAGS and POPF) need to model
    this.

    I've also somewhat cleaned up some of the flag management instruction
    definitions to be in the correct .td file.

    Adding this extra register also uncovered a failure to use the
    correct datatype to hold X86 registers, and I've corrected that as
    necessary here.

    Differential Revision: https://reviews.llvm.org/D45154

  Pull in r330264 from upstream llvm trunk (by Chandler Carruth):

    [x86] Fix PR37100 by teaching the EFLAGS copy lowering to rewrite
    uses across basic blocks in the limited cases where it is very
    straight forward to do so.

    This will also be useful for other places where we do some limited
    EFLAGS propagation across CFG edges and need to handle copy rewrites
    afterward. I think this is rapidly approaching the maximum we can and
    should be doing here. Everything else begins to require either heroic
    analysis to prove how to do PHI insertion manually, or somehow
    managing arbitrary PHI-ing of EFLAGS with general PHI insertion.
    Neither of these seem at all promising so if those cases come up,
    we'll almost certainly need to rewrite the parts of LLVM that produce
    those patterns.

    We do now require dominator trees in order to reliably diagnose
    patterns that would require PHI nodes. This is a bit unfortunate but
    it seems better than the completely mysterious crash we would get
    otherwise.

    Differential Revision: https://reviews.llvm.org/D45673

  Together, these should ensure clang does not use pushf/popf sequences to
  save and restore flags, avoiding problems with unrelated flags (such as
  the interrupt flag) being restored unexpectedly.

  Requested by:	jtl
  PR:		225330
  MFC after:	1 week

Changes:
  head/contrib/llvm/include/llvm/CodeGen/MachineBasicBlock.h
  head/contrib/llvm/lib/CodeGen/MachineBasicBlock.cpp
  head/contrib/llvm/lib/Target/X86/Disassembler/X86Disassembler.cpp
  head/contrib/llvm/lib/Target/X86/X86.h
  head/contrib/llvm/lib/Target/X86/X86FlagsCopyLowering.cpp
  head/contrib/llvm/lib/Target/X86/X86ISelLowering.cpp
  head/contrib/llvm/lib/Target/X86/X86ISelLowering.h
  head/contrib/llvm/lib/Target/X86/X86InstrCompiler.td
  head/contrib/llvm/lib/Target/X86/X86InstrInfo.cpp
  head/contrib/llvm/lib/Target/X86/X86InstrInfo.h
  head/contrib/llvm/lib/Target/X86/X86InstrInfo.td
  head/contrib/llvm/lib/Target/X86/X86InstrSystem.td
  head/contrib/llvm/lib/Target/X86/X86RegisterInfo.td
  head/contrib/llvm/lib/Target/X86/X86Schedule.td
  head/contrib/llvm/lib/Target/X86/X86ScheduleAtom.td
  head/contrib/llvm/lib/Target/X86/X86TargetMachine.cpp
  head/contrib/llvm/tools/clang/include/clang/Driver/Options.td
  head/contrib/llvm/tools/clang/lib/Basic/Targets/X86.cpp
  head/contrib/llvm/tools/clang/lib/Basic/Targets/X86.h
  head/lib/clang/freebsd_cc_version.h
  head/lib/clang/libllvm/Makefile
Comment 7 commit-hook freebsd_committer 2018-04-20 22:46:36 UTC
A commit references this bug:

Author: brooks
Date: Fri Apr 20 22:46:23 UTC 2018
New revision: 467849
URL: https://svnweb.freebsd.org/changeset/ports/467849

Log:
  Merge r332833 from FreeBSD HEAD.

  This should ensure clang does not use pushf/popf sequences to
  save and restore flags, avoiding problems with unrelated flags (such as
  the interrupt flag) being restored unexpectedly.

  PR:		225330

Changes:
  head/devel/llvm60/Makefile
  head/devel/llvm60/files/clang/patch-fsvn-r332833-clang
  head/devel/llvm60/files/patch-fsvn-r332833
Comment 8 commit-hook freebsd_committer 2018-04-23 23:09:05 UTC
A commit references this bug:

Author: dim
Date: Mon Apr 23 23:07:58 UTC 2018
New revision: 332898
URL: https://svnweb.freebsd.org/changeset/base/332898

Log:
  Pull in r329771 from upstream llvm trunk (by Craig Topper):

    [X86] In X86FlagsCopyLowering, when rewriting a memory setcc we need
    to emit an explicit MOV8mr instruction.

    Previously the code only knew how to handle setcc to a register.

    This should fix a crash in the chromium build.

  This fixes various assertion failures while building ports targeting
  i386:
  * www/firefox: isReg() && "This is not a register operand!"
  * www/iridium, www/qt5-webengine: (I.atEnd() || std::next(I) ==
    def_instr_end()) && "getVRegDef assumes a single definition or no
    definition"
  * devel/powerpc64-gcc: FromReg != ToReg && "Cannot replace a reg with
    itself"

  Reported by:	jbeich
  PR:		225330, 227686, 227698, 227699
  MFC after:	1 week
  X-MFC-With:	r332833

Changes:
  head/contrib/llvm/lib/Target/X86/X86FlagsCopyLowering.cpp
Comment 9 commit-hook freebsd_committer 2018-04-27 19:22:19 UTC
A commit references this bug:

Author: dim
Date: Fri Apr 27 19:21:41 UTC 2018
New revision: 333070
URL: https://svnweb.freebsd.org/changeset/base/333070

Log:
  MFC r332833:

  Recommit r332501, with an additional upstream fix for "Cannot lower
  EFLAGS copy that lives out of a basic block!" errors on i386.

  Pull in r325446 from upstream clang trunk (by me):

    [X86] Add 'sahf' CPU feature to frontend

    Summary:
    Make clang accept `-msahf` (and `-mno-sahf`) flags to activate the
    `+sahf` feature for the backend, for bug 36028 (Incorrect use of
    pushf/popf enables/disables interrupts on amd64 kernels).  This was
    originally submitted in bug 36037 by Jonathan Looney
    <jonlooney@gmail.com>.

    As described there, GCC also uses `-msahf` for this feature, and the
    backend already recognizes the `+sahf` feature. All that is needed is
    to teach clang to pass this on to the backend.

    The mapping of feature support onto CPUs may not be complete; rather,
    it was chosen to match LLVM's idea of which CPUs support this feature
    (see lib/Target/X86/X86.td).

    I also updated the affected test case (CodeGen/attr-target-x86.c) to
    match the emitted output.

    Reviewers: craig.topper, coby, efriedma, rsmith

    Reviewed By: craig.topper

    Subscribers: emaste, cfe-commits

    Differential Revision: https://reviews.llvm.org/D43394

  Pull in r328944 from upstream llvm trunk (by Chandler Carruth):

    [x86] Expose more of the condition conversion routines in the public
    API for X86's instruction information. I've now got a second patch
    under review that needs these same APIs. This bit is nicely
    orthogonal and obvious, so landing it. NFC.

  Pull in r329414 from upstream llvm trunk (by Craig Topper):

    [X86] Merge itineraries for CLC, CMC, and STC.

    These are very simple flag setting instructions that appear to only
    be a single uop. They're unlikely to need this separation.

  Pull in r329657 from upstream llvm trunk (by Chandler Carruth):

    [x86] Introduce a pass to begin more systematically fixing PR36028
    and similar issues.

    The key idea is to lower COPY nodes populating EFLAGS by scanning the
    uses of EFLAGS and introducing dedicated code to preserve the
    necessary state in a GPR. In the vast majority of cases, these uses
    are cmovCC and jCC instructions. For such cases, we can very easily
    save and restore the necessary information by simply inserting a
    setCC into a GPR where the original flags are live, and then testing
    that GPR directly to feed the cmov or conditional branch.

    However, things are a bit more tricky if arithmetic is using the
    flags.  This patch handles the vast majority of cases that seem to
    come up in practice: adc, adcx, adox, rcl, and rcr; all without
    taking advantage of partially preserved EFLAGS as LLVM doesn't
    currently model that at all.

    There are a large number of operations that techinaclly observe
    EFLAGS currently but shouldn't in this case -- they typically are
    using DF.  Currently, they will not be handled by this approach.
    However, I have never seen this issue come up in practice. It is
    already pretty rare to have these patterns come up in practical code
    with LLVM. I had to resort to writing MIR tests to cover most of the
    logic in this pass already.  I suspect even with its current amount
    of coverage of arithmetic users of EFLAGS it will be a significant
    improvement over the current use of pushf/popf. It will also produce
    substantially faster code in most of the common patterns.

    This patch also removes all of the old lowering for EFLAGS copies,
    and the hack that forced us to use a frame pointer when EFLAGS copies
    were found anywhere in a function so that the dynamic stack
    adjustment wasn't a problem. None of this is needed as we now lower
    all of these copies directly in MI and without require stack
    adjustments.

    Lots of thanks to Reid who came up with several aspects of this
    approach, and Craig who helped me work out a couple of things
    tripping me up while working on this.

    Differential Revision: https://reviews.llvm.org/D45146

  Pull in r329673 from upstream llvm trunk (by Chandler Carruth):

    [x86] Model the direction flag (DF) separately from the rest of
    EFLAGS.

    This cleans up a number of operations that only claimed te use EFLAGS
    due to using DF. But no instructions which we think of us setting
    EFLAGS actually modify DF (other than things like popf) and so this
    needlessly creates uses of EFLAGS that aren't really there.

    In fact, DF is so restrictive it is pretty easy to model. Only STD,
    CLD, and the whole-flags writes (WRFLAGS and POPF) need to model
    this.

    I've also somewhat cleaned up some of the flag management instruction
    definitions to be in the correct .td file.

    Adding this extra register also uncovered a failure to use the
    correct datatype to hold X86 registers, and I've corrected that as
    necessary here.

    Differential Revision: https://reviews.llvm.org/D45154

  Pull in r330264 from upstream llvm trunk (by Chandler Carruth):

    [x86] Fix PR37100 by teaching the EFLAGS copy lowering to rewrite
    uses across basic blocks in the limited cases where it is very
    straight forward to do so.

    This will also be useful for other places where we do some limited
    EFLAGS propagation across CFG edges and need to handle copy rewrites
    afterward. I think this is rapidly approaching the maximum we can and
    should be doing here. Everything else begins to require either heroic
    analysis to prove how to do PHI insertion manually, or somehow
    managing arbitrary PHI-ing of EFLAGS with general PHI insertion.
    Neither of these seem at all promising so if those cases come up,
    we'll almost certainly need to rewrite the parts of LLVM that produce
    those patterns.

    We do now require dominator trees in order to reliably diagnose
    patterns that would require PHI nodes. This is a bit unfortunate but
    it seems better than the completely mysterious crash we would get
    otherwise.

    Differential Revision: https://reviews.llvm.org/D45673

  Together, these should ensure clang does not use pushf/popf sequences to
  save and restore flags, avoiding problems with unrelated flags (such as
  the interrupt flag) being restored unexpectedly.

  Requested by:	jtl
  PR:		225330

  MFC r332898:

  Pull in r329771 from upstream llvm trunk (by Craig Topper):

    [X86] In X86FlagsCopyLowering, when rewriting a memory setcc we need
    to emit an explicit MOV8mr instruction.

    Previously the code only knew how to handle setcc to a register.

    This should fix a crash in the chromium build.

  This fixes various assertion failures while building ports targeting
  i386:
  * www/firefox: isReg() && "This is not a register operand!"
  * www/iridium, www/qt5-webengine: (I.atEnd() || std::next(I) ==
    def_instr_end()) && "getVRegDef assumes a single definition or no
    definition"
  * devel/powerpc64-gcc: FromReg != ToReg && "Cannot replace a reg with
    itself"

  Reported by:	jbeich
  PR:		225330, 227686, 227698, 227699

Changes:
_U  stable/11/
  stable/11/contrib/llvm/include/llvm/CodeGen/MachineBasicBlock.h
  stable/11/contrib/llvm/lib/CodeGen/MachineBasicBlock.cpp
  stable/11/contrib/llvm/lib/Target/X86/Disassembler/X86Disassembler.cpp
  stable/11/contrib/llvm/lib/Target/X86/X86.h
  stable/11/contrib/llvm/lib/Target/X86/X86FlagsCopyLowering.cpp
  stable/11/contrib/llvm/lib/Target/X86/X86ISelLowering.cpp
  stable/11/contrib/llvm/lib/Target/X86/X86ISelLowering.h
  stable/11/contrib/llvm/lib/Target/X86/X86InstrCompiler.td
  stable/11/contrib/llvm/lib/Target/X86/X86InstrInfo.cpp
  stable/11/contrib/llvm/lib/Target/X86/X86InstrInfo.h
  stable/11/contrib/llvm/lib/Target/X86/X86InstrInfo.td
  stable/11/contrib/llvm/lib/Target/X86/X86InstrSystem.td
  stable/11/contrib/llvm/lib/Target/X86/X86RegisterInfo.td
  stable/11/contrib/llvm/lib/Target/X86/X86Schedule.td
  stable/11/contrib/llvm/lib/Target/X86/X86ScheduleAtom.td
  stable/11/contrib/llvm/lib/Target/X86/X86TargetMachine.cpp
  stable/11/contrib/llvm/tools/clang/include/clang/Driver/Options.td
  stable/11/contrib/llvm/tools/clang/lib/Basic/Targets/X86.cpp
  stable/11/contrib/llvm/tools/clang/lib/Basic/Targets/X86.h
  stable/11/lib/clang/freebsd_cc_version.h
  stable/11/lib/clang/libllvm/Makefile
Comment 10 commit-hook freebsd_committer 2018-04-27 19:22:24 UTC
A commit references this bug:

Author: dim
Date: Fri Apr 27 19:21:41 UTC 2018
New revision: 333070
URL: https://svnweb.freebsd.org/changeset/base/333070

Log:
  MFC r332833:

  Recommit r332501, with an additional upstream fix for "Cannot lower
  EFLAGS copy that lives out of a basic block!" errors on i386.

  Pull in r325446 from upstream clang trunk (by me):

    [X86] Add 'sahf' CPU feature to frontend

    Summary:
    Make clang accept `-msahf` (and `-mno-sahf`) flags to activate the
    `+sahf` feature for the backend, for bug 36028 (Incorrect use of
    pushf/popf enables/disables interrupts on amd64 kernels).  This was
    originally submitted in bug 36037 by Jonathan Looney
    <jonlooney@gmail.com>.

    As described there, GCC also uses `-msahf` for this feature, and the
    backend already recognizes the `+sahf` feature. All that is needed is
    to teach clang to pass this on to the backend.

    The mapping of feature support onto CPUs may not be complete; rather,
    it was chosen to match LLVM's idea of which CPUs support this feature
    (see lib/Target/X86/X86.td).

    I also updated the affected test case (CodeGen/attr-target-x86.c) to
    match the emitted output.

    Reviewers: craig.topper, coby, efriedma, rsmith

    Reviewed By: craig.topper

    Subscribers: emaste, cfe-commits

    Differential Revision: https://reviews.llvm.org/D43394

  Pull in r328944 from upstream llvm trunk (by Chandler Carruth):

    [x86] Expose more of the condition conversion routines in the public
    API for X86's instruction information. I've now got a second patch
    under review that needs these same APIs. This bit is nicely
    orthogonal and obvious, so landing it. NFC.

  Pull in r329414 from upstream llvm trunk (by Craig Topper):

    [X86] Merge itineraries for CLC, CMC, and STC.

    These are very simple flag setting instructions that appear to only
    be a single uop. They're unlikely to need this separation.

  Pull in r329657 from upstream llvm trunk (by Chandler Carruth):

    [x86] Introduce a pass to begin more systematically fixing PR36028
    and similar issues.

    The key idea is to lower COPY nodes populating EFLAGS by scanning the
    uses of EFLAGS and introducing dedicated code to preserve the
    necessary state in a GPR. In the vast majority of cases, these uses
    are cmovCC and jCC instructions. For such cases, we can very easily
    save and restore the necessary information by simply inserting a
    setCC into a GPR where the original flags are live, and then testing
    that GPR directly to feed the cmov or conditional branch.

    However, things are a bit more tricky if arithmetic is using the
    flags.  This patch handles the vast majority of cases that seem to
    come up in practice: adc, adcx, adox, rcl, and rcr; all without
    taking advantage of partially preserved EFLAGS as LLVM doesn't
    currently model that at all.

    There are a large number of operations that techinaclly observe
    EFLAGS currently but shouldn't in this case -- they typically are
    using DF.  Currently, they will not be handled by this approach.
    However, I have never seen this issue come up in practice. It is
    already pretty rare to have these patterns come up in practical code
    with LLVM. I had to resort to writing MIR tests to cover most of the
    logic in this pass already.  I suspect even with its current amount
    of coverage of arithmetic users of EFLAGS it will be a significant
    improvement over the current use of pushf/popf. It will also produce
    substantially faster code in most of the common patterns.

    This patch also removes all of the old lowering for EFLAGS copies,
    and the hack that forced us to use a frame pointer when EFLAGS copies
    were found anywhere in a function so that the dynamic stack
    adjustment wasn't a problem. None of this is needed as we now lower
    all of these copies directly in MI and without require stack
    adjustments.

    Lots of thanks to Reid who came up with several aspects of this
    approach, and Craig who helped me work out a couple of things
    tripping me up while working on this.

    Differential Revision: https://reviews.llvm.org/D45146

  Pull in r329673 from upstream llvm trunk (by Chandler Carruth):

    [x86] Model the direction flag (DF) separately from the rest of
    EFLAGS.

    This cleans up a number of operations that only claimed te use EFLAGS
    due to using DF. But no instructions which we think of us setting
    EFLAGS actually modify DF (other than things like popf) and so this
    needlessly creates uses of EFLAGS that aren't really there.

    In fact, DF is so restrictive it is pretty easy to model. Only STD,
    CLD, and the whole-flags writes (WRFLAGS and POPF) need to model
    this.

    I've also somewhat cleaned up some of the flag management instruction
    definitions to be in the correct .td file.

    Adding this extra register also uncovered a failure to use the
    correct datatype to hold X86 registers, and I've corrected that as
    necessary here.

    Differential Revision: https://reviews.llvm.org/D45154

  Pull in r330264 from upstream llvm trunk (by Chandler Carruth):

    [x86] Fix PR37100 by teaching the EFLAGS copy lowering to rewrite
    uses across basic blocks in the limited cases where it is very
    straight forward to do so.

    This will also be useful for other places where we do some limited
    EFLAGS propagation across CFG edges and need to handle copy rewrites
    afterward. I think this is rapidly approaching the maximum we can and
    should be doing here. Everything else begins to require either heroic
    analysis to prove how to do PHI insertion manually, or somehow
    managing arbitrary PHI-ing of EFLAGS with general PHI insertion.
    Neither of these seem at all promising so if those cases come up,
    we'll almost certainly need to rewrite the parts of LLVM that produce
    those patterns.

    We do now require dominator trees in order to reliably diagnose
    patterns that would require PHI nodes. This is a bit unfortunate but
    it seems better than the completely mysterious crash we would get
    otherwise.

    Differential Revision: https://reviews.llvm.org/D45673

  Together, these should ensure clang does not use pushf/popf sequences to
  save and restore flags, avoiding problems with unrelated flags (such as
  the interrupt flag) being restored unexpectedly.

  Requested by:	jtl
  PR:		225330

  MFC r332898:

  Pull in r329771 from upstream llvm trunk (by Craig Topper):

    [X86] In X86FlagsCopyLowering, when rewriting a memory setcc we need
    to emit an explicit MOV8mr instruction.

    Previously the code only knew how to handle setcc to a register.

    This should fix a crash in the chromium build.

  This fixes various assertion failures while building ports targeting
  i386:
  * www/firefox: isReg() && "This is not a register operand!"
  * www/iridium, www/qt5-webengine: (I.atEnd() || std::next(I) ==
    def_instr_end()) && "getVRegDef assumes a single definition or no
    definition"
  * devel/powerpc64-gcc: FromReg != ToReg && "Cannot replace a reg with
    itself"

  Reported by:	jbeich
  PR:		225330, 227686, 227698, 227699

Changes:
_U  stable/11/
  stable/11/contrib/llvm/include/llvm/CodeGen/MachineBasicBlock.h
  stable/11/contrib/llvm/lib/CodeGen/MachineBasicBlock.cpp
  stable/11/contrib/llvm/lib/Target/X86/Disassembler/X86Disassembler.cpp
  stable/11/contrib/llvm/lib/Target/X86/X86.h
  stable/11/contrib/llvm/lib/Target/X86/X86FlagsCopyLowering.cpp
  stable/11/contrib/llvm/lib/Target/X86/X86ISelLowering.cpp
  stable/11/contrib/llvm/lib/Target/X86/X86ISelLowering.h
  stable/11/contrib/llvm/lib/Target/X86/X86InstrCompiler.td
  stable/11/contrib/llvm/lib/Target/X86/X86InstrInfo.cpp
  stable/11/contrib/llvm/lib/Target/X86/X86InstrInfo.h
  stable/11/contrib/llvm/lib/Target/X86/X86InstrInfo.td
  stable/11/contrib/llvm/lib/Target/X86/X86InstrSystem.td
  stable/11/contrib/llvm/lib/Target/X86/X86RegisterInfo.td
  stable/11/contrib/llvm/lib/Target/X86/X86Schedule.td
  stable/11/contrib/llvm/lib/Target/X86/X86ScheduleAtom.td
  stable/11/contrib/llvm/lib/Target/X86/X86TargetMachine.cpp
  stable/11/contrib/llvm/tools/clang/include/clang/Driver/Options.td
  stable/11/contrib/llvm/tools/clang/lib/Basic/Targets/X86.cpp
  stable/11/contrib/llvm/tools/clang/lib/Basic/Targets/X86.h
  stable/11/lib/clang/freebsd_cc_version.h
  stable/11/lib/clang/libllvm/Makefile