Created attachment 189922 [details] Patch to make clang do a slightly better job at restoring the EFLAGS register While making a change to kernel code, I recently discovered a compiler bug that can incorrectly enable or disable interrupts on amd64. Imagine this code: void fx(void) { _Bool needtocall; mtx_lock_spin(&lock); needtocall = (--count == 0); mtx_unlock_spin(&lock); if (needtocall) otherfx(); } If you compile this code in clang (5.0.0, 5.0.1, or 6.0.0 all behave similarly in the critical respect) either in a kernel module or with mutex inlining disabled, you get this assembly (dumped with `objdump -S` to interleave the assembly and source code): void fx(void) { 0: 55 push %rbp 1: 48 89 e5 mov %rsp,%rbp _Bool needtocall; mtx_lock_spin(&lock); 4: 53 push %rbx 5: 50 push %rax 6: 48 c7 c7 00 00 00 00 mov $0x0,%rdi d: 31 f6 xor %esi,%esi f: 48 c7 c2 00 00 00 00 mov $0x0,%rdx 16: b9 18 00 00 00 mov $0x18,%ecx 1b: e8 00 00 00 00 callq 20 <fx+0x20> needtocall = (--count == 0); 20: ff 0c 25 00 00 00 00 decl 0x0 27: 9c pushfq 28: 5b pop %rbx mtx_unlock_spin(&lock); 29: 48 c7 c7 00 00 00 00 mov $0x0,%rdi 30: 31 f6 xor %esi,%esi 32: 48 c7 c2 00 00 00 00 mov $0x0,%rdx 39: b9 1a 00 00 00 mov $0x1a,%ecx 3e: e8 00 00 00 00 callq 43 <fx+0x43> if (needtocall) otherfx(); 43: 48 83 c4 08 add $0x8,%rsp 47: 53 push %rbx 48: 9d popfq _Bool needtocall; mtx_lock_spin(&lock); needtocall = (--count == 0); mtx_unlock_spin(&lock); if (needtocall) 49: 74 03 je 4e <fx+0x4e> otherfx(); } 4b: 5b pop %rbx 4c: 5d pop %rbp 4d: c3 retq mtx_lock_spin(&lock); needtocall = (--count == 0); mtx_unlock_spin(&lock); if (needtocall) otherfx(); 4e: 5b pop %rbx 4f: 5d pop %rbp 50: e9 00 00 00 00 jmpq 55 <count+0x51> The critical part is the way that clang handles the conditional: 1. Decrement the value. This will set ZF if count reaches 0: 20: ff 0c 25 00 00 00 00 decl 0x0 2. Now, save the status flags (particularly, ZF) using pushfq. Critically, pushfq saves the entire EFLAGS value, which includes the interrupt flag. Because we are holding a spin mutex, interrupts are disabled at this point: 27: 9c pushfq 28: 5b pop %rbx 3. Now, call the function to unlock the spin mutex, which will potentially enable interrupts. 4. Now, restore the status flags. Critically, popfq restores the entire EFLAGS value, which includes the interrupt flag. Because interrupts were disabled when the flags were stored, these instructions will disable them again: 43: 48 83 c4 08 add $0x8,%rsp 47: 53 push %rbx 48: 9d popfq 5. Now, use the status flags to determine whether to call the function: 49: 74 03 je 4e <fx+0x4e> After running this code, interrupts may remain erroneously disabled. (This is a contrived minimal test case, but I did run into this bug while testing a code change in the kernel. Interrupts remained erroneously disabled after dropping a spin lock. And, in that case, the code was in the actual kernel -- not a module.) By now, the bug should be obvious: clang shouldn't save and restore the entire EFLAGS register. It doesn't know what changes these other function calls make to the values of the EFLAGS register about which clang doesn't care (e.g. IF, IOPL, AC, etc.). By saving and restoring EFLAGS across function calls, it can load incorrect values for these other flags. It looks like the correct solutions are to either not save EFLAGS across function calls, or to just save the individual flag(s) that need to be evaluated across the function call. (For example, a simple setz/test would probably be sufficient -- and more efficient -- in this case.) However, these are very significant changes that are complicated. Two other things to note: 1. If clang knows that the target hardware supports lahf/sahf, it will use those instructions instead. That solves the problem, as it doesn't reset any flags other than status flags. (It still may not be as efficient as setz, but at least it is safe for kernel code.) lahf/sahf are supported on all i386 and *most* (but not all) amd64 CPUs. (Apparently, a few older amd64 CPUs didn't support the instruction.) Therefore, clang assumes lahf/sahf are supported on all i386 architectures, but assume it is not supported on amd64 unless the user specifies a target CPU class known to support it. This is a long way of saying that the default amd64 FreeBSD build assumes that it is not supported, leading to the buggy use of pushf/popf shown above. 2. In practice, this seems to very seldom show up in our kernel binaries. Even a trivial change to the above code (for example, "--count == 5" instead of "--count == 0") would make the compiler no longer save the EFLAGS register across the function call. I quickly audited the kernel and modules I compile and use at work, and found only one place that compiles with the pushf/popf instructions to save/restore the EFLAGS register across a function call. (Of course, this fragility can work in the reverse: a trivial change, such as the one I tried when I found this bug, can cause the compiler to use the pushf/popf instructions where it had not previously and, therefore, introduce a bug in interrupt handling.) Finally, I have a patch that "fixes" (or, probably more accurately, "works around") the bug. It changes the pushf/popf so that the program will retrieve the latest EFLAGS register at restore time and only modify the status bits. In C Pseudo-code, it does this: save_eflags(register dest) { dest = load_eflags(); dest &= (OF|SF|ZF|AF|PF|CF); } restore_eflags(register src) { register RAX; RAX = load_eflags(); RAX &= ~(OF|SF|ZF|AF|PF|CF); RAX |= src; set_eflags(RAX); } The new assembly sequence looks like this: Save EFLAGS: 28: 9c pushfq 29: 5b pop %rbx 2a: 48 81 e3 d5 08 00 00 and $0x8d5,%rbx Restore EFLAGS: 4f: 9c pushfq 50: 48 8b 04 24 mov (%rsp),%rax 54: 48 25 2a f7 ff ff and $0xfffffffffffff72a,%rax 5a: 48 09 d8 or %rbx,%rax 5d: 48 89 04 24 mov %rax,(%rsp) 61: 9d popfq This is far from ideal, but is better than what is there. And, the good news is that the compiler seems to actually save/restore EFLAGS very few places in our kernel code (hopefully, none which are in hot code paths). So, perhaps, this is good enough for now. Yet another alternative would be to enable lahf/sahf by default, and provide a way for people to disable it if their CPU doesn't support those instructions (with an appropriate warning about how it may be buggy).
It would be much better to take this issue upstream. We would not want to carry patches like this without them having been checked over first.
(In reply to Dimitry Andric from comment #1) I filed an upstream bug report today: https://bugs.llvm.org/show_bug.cgi?id=36028.
The upstream bug is fixed in r329657 and r329673. Also, dim@ fixed a related bug that allows us to use -msahf as a workaround (committed in upstream r325446). Once those revisions appear in our tree, this bug should be fixed.
A commit references this bug: Author: dim Date: Sat Apr 14 12:07:07 UTC 2018 New revision: 332501 URL: https://svnweb.freebsd.org/changeset/base/332501 Log: Pull in r325446 from upstream clang trunk (by me): [X86] Add 'sahf' CPU feature to frontend Summary: Make clang accept `-msahf` (and `-mno-sahf`) flags to activate the `+sahf` feature for the backend, for bug 36028 (Incorrect use of pushf/popf enables/disables interrupts on amd64 kernels). This was originally submitted in bug 36037 by Jonathan Looney <jonlooney@gmail.com>. As described there, GCC also uses `-msahf` for this feature, and the backend already recognizes the `+sahf` feature. All that is needed is to teach clang to pass this on to the backend. The mapping of feature support onto CPUs may not be complete; rather, it was chosen to match LLVM's idea of which CPUs support this feature (see lib/Target/X86/X86.td). I also updated the affected test case (CodeGen/attr-target-x86.c) to match the emitted output. Reviewers: craig.topper, coby, efriedma, rsmith Reviewed By: craig.topper Subscribers: emaste, cfe-commits Differential Revision: https://reviews.llvm.org/D43394 Pull in r328944 from upstream llvm trunk (by Chandler Carruth): [x86] Expose more of the condition conversion routines in the public API for X86's instruction information. I've now got a second patch under review that needs these same APIs. This bit is nicely orthogonal and obvious, so landing it. NFC. Pull in r329414 from upstream llvm trunk (by Craig Topper): [X86] Merge itineraries for CLC, CMC, and STC. These are very simple flag setting instructions that appear to only be a single uop. They're unlikely to need this separation. Pull in r329657 from upstream llvm trunk (by Chandler Carruth): [x86] Introduce a pass to begin more systematically fixing PR36028 and similar issues. The key idea is to lower COPY nodes populating EFLAGS by scanning the uses of EFLAGS and introducing dedicated code to preserve the necessary state in a GPR. In the vast majority of cases, these uses are cmovCC and jCC instructions. For such cases, we can very easily save and restore the necessary information by simply inserting a setCC into a GPR where the original flags are live, and then testing that GPR directly to feed the cmov or conditional branch. However, things are a bit more tricky if arithmetic is using the flags. This patch handles the vast majority of cases that seem to come up in practice: adc, adcx, adox, rcl, and rcr; all without taking advantage of partially preserved EFLAGS as LLVM doesn't currently model that at all. There are a large number of operations that techinaclly observe EFLAGS currently but shouldn't in this case -- they typically are using DF. Currently, they will not be handled by this approach. However, I have never seen this issue come up in practice. It is already pretty rare to have these patterns come up in practical code with LLVM. I had to resort to writing MIR tests to cover most of the logic in this pass already. I suspect even with its current amount of coverage of arithmetic users of EFLAGS it will be a significant improvement over the current use of pushf/popf. It will also produce substantially faster code in most of the common patterns. This patch also removes all of the old lowering for EFLAGS copies, and the hack that forced us to use a frame pointer when EFLAGS copies were found anywhere in a function so that the dynamic stack adjustment wasn't a problem. None of this is needed as we now lower all of these copies directly in MI and without require stack adjustments. Lots of thanks to Reid who came up with several aspects of this approach, and Craig who helped me work out a couple of things tripping me up while working on this. Differential Revision: https://reviews.llvm.org/D45146 Pull in r329673 from upstream llvm trunk (by Chandler Carruth): [x86] Model the direction flag (DF) separately from the rest of EFLAGS. This cleans up a number of operations that only claimed te use EFLAGS due to using DF. But no instructions which we think of us setting EFLAGS actually modify DF (other than things like popf) and so this needlessly creates uses of EFLAGS that aren't really there. In fact, DF is so restrictive it is pretty easy to model. Only STD, CLD, and the whole-flags writes (WRFLAGS and POPF) need to model this. I've also somewhat cleaned up some of the flag management instruction definitions to be in the correct .td file. Adding this extra register also uncovered a failure to use the correct datatype to hold X86 registers, and I've corrected that as necessary here. Differential Revision: https://reviews.llvm.org/D45154 Together, these should ensure clang does not use pushf/popf sequences to save and restore flags, avoiding problems with unrelated flags (such as the interrupt flag) being restored unexpectedly. Requested by: jtl PR: 225330 MFC after: 1 week Changes: head/contrib/llvm/include/llvm/CodeGen/MachineBasicBlock.h head/contrib/llvm/lib/CodeGen/MachineBasicBlock.cpp head/contrib/llvm/lib/Target/X86/Disassembler/X86Disassembler.cpp head/contrib/llvm/lib/Target/X86/X86.h head/contrib/llvm/lib/Target/X86/X86FlagsCopyLowering.cpp head/contrib/llvm/lib/Target/X86/X86ISelLowering.cpp head/contrib/llvm/lib/Target/X86/X86ISelLowering.h head/contrib/llvm/lib/Target/X86/X86InstrCompiler.td head/contrib/llvm/lib/Target/X86/X86InstrInfo.cpp head/contrib/llvm/lib/Target/X86/X86InstrInfo.h head/contrib/llvm/lib/Target/X86/X86InstrInfo.td head/contrib/llvm/lib/Target/X86/X86InstrSystem.td head/contrib/llvm/lib/Target/X86/X86RegisterInfo.td head/contrib/llvm/lib/Target/X86/X86Schedule.td head/contrib/llvm/lib/Target/X86/X86ScheduleAtom.td head/contrib/llvm/lib/Target/X86/X86TargetMachine.cpp head/contrib/llvm/tools/clang/include/clang/Driver/Options.td head/contrib/llvm/tools/clang/lib/Basic/Targets/X86.cpp head/contrib/llvm/tools/clang/lib/Basic/Targets/X86.h head/lib/clang/freebsd_cc_version.h head/lib/clang/libllvm/Makefile
A commit references this bug: Author: dim Date: Sat Apr 14 14:57:34 UTC 2018 New revision: 332503 URL: https://svnweb.freebsd.org/changeset/base/332503 Log: Revert r332501 for now, as it can cause build failures on i386. Reported upstream as <https://bugs.llvm.org/show_bug.cgi?id=37133>. Reported by: emaste, ci.freebsd.org PR: 225330 Changes: head/contrib/llvm/include/llvm/CodeGen/MachineBasicBlock.h head/contrib/llvm/lib/CodeGen/MachineBasicBlock.cpp head/contrib/llvm/lib/Target/X86/Disassembler/X86Disassembler.cpp head/contrib/llvm/lib/Target/X86/X86.h head/contrib/llvm/lib/Target/X86/X86FlagsCopyLowering.cpp head/contrib/llvm/lib/Target/X86/X86ISelLowering.cpp head/contrib/llvm/lib/Target/X86/X86ISelLowering.h head/contrib/llvm/lib/Target/X86/X86InstrCompiler.td head/contrib/llvm/lib/Target/X86/X86InstrInfo.cpp head/contrib/llvm/lib/Target/X86/X86InstrInfo.h head/contrib/llvm/lib/Target/X86/X86InstrInfo.td head/contrib/llvm/lib/Target/X86/X86InstrSystem.td head/contrib/llvm/lib/Target/X86/X86RegisterInfo.td head/contrib/llvm/lib/Target/X86/X86Schedule.td head/contrib/llvm/lib/Target/X86/X86ScheduleAtom.td head/contrib/llvm/lib/Target/X86/X86TargetMachine.cpp head/contrib/llvm/tools/clang/include/clang/Driver/Options.td head/contrib/llvm/tools/clang/lib/Basic/Targets/X86.cpp head/contrib/llvm/tools/clang/lib/Basic/Targets/X86.h head/lib/clang/freebsd_cc_version.h head/lib/clang/libllvm/Makefile
A commit references this bug: Author: dim Date: Fri Apr 20 18:20:56 UTC 2018 New revision: 332833 URL: https://svnweb.freebsd.org/changeset/base/332833 Log: Recommit r332501, with an additional upstream fix for "Cannot lower EFLAGS copy that lives out of a basic block!" errors on i386. Pull in r325446 from upstream clang trunk (by me): [X86] Add 'sahf' CPU feature to frontend Summary: Make clang accept `-msahf` (and `-mno-sahf`) flags to activate the `+sahf` feature for the backend, for bug 36028 (Incorrect use of pushf/popf enables/disables interrupts on amd64 kernels). This was originally submitted in bug 36037 by Jonathan Looney <jonlooney@gmail.com>. As described there, GCC also uses `-msahf` for this feature, and the backend already recognizes the `+sahf` feature. All that is needed is to teach clang to pass this on to the backend. The mapping of feature support onto CPUs may not be complete; rather, it was chosen to match LLVM's idea of which CPUs support this feature (see lib/Target/X86/X86.td). I also updated the affected test case (CodeGen/attr-target-x86.c) to match the emitted output. Reviewers: craig.topper, coby, efriedma, rsmith Reviewed By: craig.topper Subscribers: emaste, cfe-commits Differential Revision: https://reviews.llvm.org/D43394 Pull in r328944 from upstream llvm trunk (by Chandler Carruth): [x86] Expose more of the condition conversion routines in the public API for X86's instruction information. I've now got a second patch under review that needs these same APIs. This bit is nicely orthogonal and obvious, so landing it. NFC. Pull in r329414 from upstream llvm trunk (by Craig Topper): [X86] Merge itineraries for CLC, CMC, and STC. These are very simple flag setting instructions that appear to only be a single uop. They're unlikely to need this separation. Pull in r329657 from upstream llvm trunk (by Chandler Carruth): [x86] Introduce a pass to begin more systematically fixing PR36028 and similar issues. The key idea is to lower COPY nodes populating EFLAGS by scanning the uses of EFLAGS and introducing dedicated code to preserve the necessary state in a GPR. In the vast majority of cases, these uses are cmovCC and jCC instructions. For such cases, we can very easily save and restore the necessary information by simply inserting a setCC into a GPR where the original flags are live, and then testing that GPR directly to feed the cmov or conditional branch. However, things are a bit more tricky if arithmetic is using the flags. This patch handles the vast majority of cases that seem to come up in practice: adc, adcx, adox, rcl, and rcr; all without taking advantage of partially preserved EFLAGS as LLVM doesn't currently model that at all. There are a large number of operations that techinaclly observe EFLAGS currently but shouldn't in this case -- they typically are using DF. Currently, they will not be handled by this approach. However, I have never seen this issue come up in practice. It is already pretty rare to have these patterns come up in practical code with LLVM. I had to resort to writing MIR tests to cover most of the logic in this pass already. I suspect even with its current amount of coverage of arithmetic users of EFLAGS it will be a significant improvement over the current use of pushf/popf. It will also produce substantially faster code in most of the common patterns. This patch also removes all of the old lowering for EFLAGS copies, and the hack that forced us to use a frame pointer when EFLAGS copies were found anywhere in a function so that the dynamic stack adjustment wasn't a problem. None of this is needed as we now lower all of these copies directly in MI and without require stack adjustments. Lots of thanks to Reid who came up with several aspects of this approach, and Craig who helped me work out a couple of things tripping me up while working on this. Differential Revision: https://reviews.llvm.org/D45146 Pull in r329673 from upstream llvm trunk (by Chandler Carruth): [x86] Model the direction flag (DF) separately from the rest of EFLAGS. This cleans up a number of operations that only claimed te use EFLAGS due to using DF. But no instructions which we think of us setting EFLAGS actually modify DF (other than things like popf) and so this needlessly creates uses of EFLAGS that aren't really there. In fact, DF is so restrictive it is pretty easy to model. Only STD, CLD, and the whole-flags writes (WRFLAGS and POPF) need to model this. I've also somewhat cleaned up some of the flag management instruction definitions to be in the correct .td file. Adding this extra register also uncovered a failure to use the correct datatype to hold X86 registers, and I've corrected that as necessary here. Differential Revision: https://reviews.llvm.org/D45154 Pull in r330264 from upstream llvm trunk (by Chandler Carruth): [x86] Fix PR37100 by teaching the EFLAGS copy lowering to rewrite uses across basic blocks in the limited cases where it is very straight forward to do so. This will also be useful for other places where we do some limited EFLAGS propagation across CFG edges and need to handle copy rewrites afterward. I think this is rapidly approaching the maximum we can and should be doing here. Everything else begins to require either heroic analysis to prove how to do PHI insertion manually, or somehow managing arbitrary PHI-ing of EFLAGS with general PHI insertion. Neither of these seem at all promising so if those cases come up, we'll almost certainly need to rewrite the parts of LLVM that produce those patterns. We do now require dominator trees in order to reliably diagnose patterns that would require PHI nodes. This is a bit unfortunate but it seems better than the completely mysterious crash we would get otherwise. Differential Revision: https://reviews.llvm.org/D45673 Together, these should ensure clang does not use pushf/popf sequences to save and restore flags, avoiding problems with unrelated flags (such as the interrupt flag) being restored unexpectedly. Requested by: jtl PR: 225330 MFC after: 1 week Changes: head/contrib/llvm/include/llvm/CodeGen/MachineBasicBlock.h head/contrib/llvm/lib/CodeGen/MachineBasicBlock.cpp head/contrib/llvm/lib/Target/X86/Disassembler/X86Disassembler.cpp head/contrib/llvm/lib/Target/X86/X86.h head/contrib/llvm/lib/Target/X86/X86FlagsCopyLowering.cpp head/contrib/llvm/lib/Target/X86/X86ISelLowering.cpp head/contrib/llvm/lib/Target/X86/X86ISelLowering.h head/contrib/llvm/lib/Target/X86/X86InstrCompiler.td head/contrib/llvm/lib/Target/X86/X86InstrInfo.cpp head/contrib/llvm/lib/Target/X86/X86InstrInfo.h head/contrib/llvm/lib/Target/X86/X86InstrInfo.td head/contrib/llvm/lib/Target/X86/X86InstrSystem.td head/contrib/llvm/lib/Target/X86/X86RegisterInfo.td head/contrib/llvm/lib/Target/X86/X86Schedule.td head/contrib/llvm/lib/Target/X86/X86ScheduleAtom.td head/contrib/llvm/lib/Target/X86/X86TargetMachine.cpp head/contrib/llvm/tools/clang/include/clang/Driver/Options.td head/contrib/llvm/tools/clang/lib/Basic/Targets/X86.cpp head/contrib/llvm/tools/clang/lib/Basic/Targets/X86.h head/lib/clang/freebsd_cc_version.h head/lib/clang/libllvm/Makefile
A commit references this bug: Author: brooks Date: Fri Apr 20 22:46:23 UTC 2018 New revision: 467849 URL: https://svnweb.freebsd.org/changeset/ports/467849 Log: Merge r332833 from FreeBSD HEAD. This should ensure clang does not use pushf/popf sequences to save and restore flags, avoiding problems with unrelated flags (such as the interrupt flag) being restored unexpectedly. PR: 225330 Changes: head/devel/llvm60/Makefile head/devel/llvm60/files/clang/patch-fsvn-r332833-clang head/devel/llvm60/files/patch-fsvn-r332833
A commit references this bug: Author: dim Date: Mon Apr 23 23:07:58 UTC 2018 New revision: 332898 URL: https://svnweb.freebsd.org/changeset/base/332898 Log: Pull in r329771 from upstream llvm trunk (by Craig Topper): [X86] In X86FlagsCopyLowering, when rewriting a memory setcc we need to emit an explicit MOV8mr instruction. Previously the code only knew how to handle setcc to a register. This should fix a crash in the chromium build. This fixes various assertion failures while building ports targeting i386: * www/firefox: isReg() && "This is not a register operand!" * www/iridium, www/qt5-webengine: (I.atEnd() || std::next(I) == def_instr_end()) && "getVRegDef assumes a single definition or no definition" * devel/powerpc64-gcc: FromReg != ToReg && "Cannot replace a reg with itself" Reported by: jbeich PR: 225330, 227686, 227698, 227699 MFC after: 1 week X-MFC-With: r332833 Changes: head/contrib/llvm/lib/Target/X86/X86FlagsCopyLowering.cpp
A commit references this bug: Author: dim Date: Fri Apr 27 19:21:41 UTC 2018 New revision: 333070 URL: https://svnweb.freebsd.org/changeset/base/333070 Log: MFC r332833: Recommit r332501, with an additional upstream fix for "Cannot lower EFLAGS copy that lives out of a basic block!" errors on i386. Pull in r325446 from upstream clang trunk (by me): [X86] Add 'sahf' CPU feature to frontend Summary: Make clang accept `-msahf` (and `-mno-sahf`) flags to activate the `+sahf` feature for the backend, for bug 36028 (Incorrect use of pushf/popf enables/disables interrupts on amd64 kernels). This was originally submitted in bug 36037 by Jonathan Looney <jonlooney@gmail.com>. As described there, GCC also uses `-msahf` for this feature, and the backend already recognizes the `+sahf` feature. All that is needed is to teach clang to pass this on to the backend. The mapping of feature support onto CPUs may not be complete; rather, it was chosen to match LLVM's idea of which CPUs support this feature (see lib/Target/X86/X86.td). I also updated the affected test case (CodeGen/attr-target-x86.c) to match the emitted output. Reviewers: craig.topper, coby, efriedma, rsmith Reviewed By: craig.topper Subscribers: emaste, cfe-commits Differential Revision: https://reviews.llvm.org/D43394 Pull in r328944 from upstream llvm trunk (by Chandler Carruth): [x86] Expose more of the condition conversion routines in the public API for X86's instruction information. I've now got a second patch under review that needs these same APIs. This bit is nicely orthogonal and obvious, so landing it. NFC. Pull in r329414 from upstream llvm trunk (by Craig Topper): [X86] Merge itineraries for CLC, CMC, and STC. These are very simple flag setting instructions that appear to only be a single uop. They're unlikely to need this separation. Pull in r329657 from upstream llvm trunk (by Chandler Carruth): [x86] Introduce a pass to begin more systematically fixing PR36028 and similar issues. The key idea is to lower COPY nodes populating EFLAGS by scanning the uses of EFLAGS and introducing dedicated code to preserve the necessary state in a GPR. In the vast majority of cases, these uses are cmovCC and jCC instructions. For such cases, we can very easily save and restore the necessary information by simply inserting a setCC into a GPR where the original flags are live, and then testing that GPR directly to feed the cmov or conditional branch. However, things are a bit more tricky if arithmetic is using the flags. This patch handles the vast majority of cases that seem to come up in practice: adc, adcx, adox, rcl, and rcr; all without taking advantage of partially preserved EFLAGS as LLVM doesn't currently model that at all. There are a large number of operations that techinaclly observe EFLAGS currently but shouldn't in this case -- they typically are using DF. Currently, they will not be handled by this approach. However, I have never seen this issue come up in practice. It is already pretty rare to have these patterns come up in practical code with LLVM. I had to resort to writing MIR tests to cover most of the logic in this pass already. I suspect even with its current amount of coverage of arithmetic users of EFLAGS it will be a significant improvement over the current use of pushf/popf. It will also produce substantially faster code in most of the common patterns. This patch also removes all of the old lowering for EFLAGS copies, and the hack that forced us to use a frame pointer when EFLAGS copies were found anywhere in a function so that the dynamic stack adjustment wasn't a problem. None of this is needed as we now lower all of these copies directly in MI and without require stack adjustments. Lots of thanks to Reid who came up with several aspects of this approach, and Craig who helped me work out a couple of things tripping me up while working on this. Differential Revision: https://reviews.llvm.org/D45146 Pull in r329673 from upstream llvm trunk (by Chandler Carruth): [x86] Model the direction flag (DF) separately from the rest of EFLAGS. This cleans up a number of operations that only claimed te use EFLAGS due to using DF. But no instructions which we think of us setting EFLAGS actually modify DF (other than things like popf) and so this needlessly creates uses of EFLAGS that aren't really there. In fact, DF is so restrictive it is pretty easy to model. Only STD, CLD, and the whole-flags writes (WRFLAGS and POPF) need to model this. I've also somewhat cleaned up some of the flag management instruction definitions to be in the correct .td file. Adding this extra register also uncovered a failure to use the correct datatype to hold X86 registers, and I've corrected that as necessary here. Differential Revision: https://reviews.llvm.org/D45154 Pull in r330264 from upstream llvm trunk (by Chandler Carruth): [x86] Fix PR37100 by teaching the EFLAGS copy lowering to rewrite uses across basic blocks in the limited cases where it is very straight forward to do so. This will also be useful for other places where we do some limited EFLAGS propagation across CFG edges and need to handle copy rewrites afterward. I think this is rapidly approaching the maximum we can and should be doing here. Everything else begins to require either heroic analysis to prove how to do PHI insertion manually, or somehow managing arbitrary PHI-ing of EFLAGS with general PHI insertion. Neither of these seem at all promising so if those cases come up, we'll almost certainly need to rewrite the parts of LLVM that produce those patterns. We do now require dominator trees in order to reliably diagnose patterns that would require PHI nodes. This is a bit unfortunate but it seems better than the completely mysterious crash we would get otherwise. Differential Revision: https://reviews.llvm.org/D45673 Together, these should ensure clang does not use pushf/popf sequences to save and restore flags, avoiding problems with unrelated flags (such as the interrupt flag) being restored unexpectedly. Requested by: jtl PR: 225330 MFC r332898: Pull in r329771 from upstream llvm trunk (by Craig Topper): [X86] In X86FlagsCopyLowering, when rewriting a memory setcc we need to emit an explicit MOV8mr instruction. Previously the code only knew how to handle setcc to a register. This should fix a crash in the chromium build. This fixes various assertion failures while building ports targeting i386: * www/firefox: isReg() && "This is not a register operand!" * www/iridium, www/qt5-webengine: (I.atEnd() || std::next(I) == def_instr_end()) && "getVRegDef assumes a single definition or no definition" * devel/powerpc64-gcc: FromReg != ToReg && "Cannot replace a reg with itself" Reported by: jbeich PR: 225330, 227686, 227698, 227699 Changes: _U stable/11/ stable/11/contrib/llvm/include/llvm/CodeGen/MachineBasicBlock.h stable/11/contrib/llvm/lib/CodeGen/MachineBasicBlock.cpp stable/11/contrib/llvm/lib/Target/X86/Disassembler/X86Disassembler.cpp stable/11/contrib/llvm/lib/Target/X86/X86.h stable/11/contrib/llvm/lib/Target/X86/X86FlagsCopyLowering.cpp stable/11/contrib/llvm/lib/Target/X86/X86ISelLowering.cpp stable/11/contrib/llvm/lib/Target/X86/X86ISelLowering.h stable/11/contrib/llvm/lib/Target/X86/X86InstrCompiler.td stable/11/contrib/llvm/lib/Target/X86/X86InstrInfo.cpp stable/11/contrib/llvm/lib/Target/X86/X86InstrInfo.h stable/11/contrib/llvm/lib/Target/X86/X86InstrInfo.td stable/11/contrib/llvm/lib/Target/X86/X86InstrSystem.td stable/11/contrib/llvm/lib/Target/X86/X86RegisterInfo.td stable/11/contrib/llvm/lib/Target/X86/X86Schedule.td stable/11/contrib/llvm/lib/Target/X86/X86ScheduleAtom.td stable/11/contrib/llvm/lib/Target/X86/X86TargetMachine.cpp stable/11/contrib/llvm/tools/clang/include/clang/Driver/Options.td stable/11/contrib/llvm/tools/clang/lib/Basic/Targets/X86.cpp stable/11/contrib/llvm/tools/clang/lib/Basic/Targets/X86.h stable/11/lib/clang/freebsd_cc_version.h stable/11/lib/clang/libllvm/Makefile