FreeBSD 11.2-RELEASE FreeBSD 11.2-RELEASE #0 r335510: Fri Jun 22 04:32:14 UTC 2018
root@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64

Compiler: clang6, gcc9, bcc32x.exe (Embarcadero, Windows)

Sample code:
--------------------------------------------
#include <stdio.h>

static long double sd(void)
{
    long double y;
    __asm__ ("\n\t"
             "fldpi \n\t"
             "fld1  \n\t"
             "fdivp \n\t"
             : "=t"(y) : : );   // st(1)=pi, st(0)=1
    return y;
}

int main(void)
{
    printf("\t%.18Lg\n", sd());
    return 0;
}
--------------------------------------------

Output:
    0.318309886183790672   = 1/pi
But correct is:
    pi                     = pi/1

Content of file.o:
0b+096  c4 10 5d c3 55 48 89 e5 >> d9 eb, d9 e8, de f1 << 5d c3 .
                                   ^^^^^  ^^^^^  ^^^^^
Opcodes of 'fldpi', 'fld1' and /division/.

The emitted opcode is the wrong one: 'de f1', which is 'fdivrp', and not 'de f9' for the 'fdivp' that was written!
These wrong translations appear on fdivxx and fsubxx.
11.x is no longer supported. Could you check whether this still happens on 13.1 or even -CURRENT? Thanks!
(In reply to Li-Wen Hsu from comment #1)
Just checked with clang-14 and gcc11 on -CURRENT/amd64: the result is identical to the one reported (0.31...).

Disassembly of sd() with objdump:

00000000002018e0 <sd>:
  2018e0: 55                    push   %rbp
  2018e1: 48 89 e5              mov    %rsp,%rbp
  2018e4: d9 eb                 fldpi
  2018e6: d9 e8                 fld1
  2018e8: de f1                 fdivp  %st,%st(1)
  2018ea: db 7d f0              fstpt  -0x10(%rbp)
  2018ed: db 6d f0              fldt   -0x10(%rbp)
  2018f0: 5d                    pop    %rbp
  2018f1: c3                    ret

Intel Architecture Software Developer’s Manual - Volume 2: Instruction Set Reference:

> Opcode    Instruction         Description
> D8 /6     FDIV m32real        Divide ST(0) by m32real and store result in ST(0)
> DC /6     FDIV m64real        Divide ST(0) by m64real and store result in ST(0)
> D8 F0+i   FDIV ST(0),ST(i)    Divide ST(0) by ST(i) and store result in ST(0)
> DC F8+i   FDIV ST(i),ST(0)    Divide ST(i) by ST(0) and store result in ST(i)
> DE F8+i   FDIVP ST(i),ST(0)   Divide ST(i) by ST(0), store result in ST(i), and pop the register stack

The byte sequence "de f1" does not appear among these FDIV/FDIVP entries of the reference manual, but if it is decoded by the processor, then the disassembled instruction "fdivp %st,%st(1)" might behave in this (undocumented) way:

> DE F0+i   FDIVP ST(0),ST(i)   Divide ST(0) by ST(i), store result in ST(i), and pop the register stack

And that would explain the result obtained.

Maybe the assembler instruction "fdivp \n\t" is interpreted as if it was "fdivp %st,%st(1) \n\t", i.e. with operands reversed from what you'd expect?
Intel:
FDIV ST(0), ST(i)    D8 F0+i   Divide ST(0) by ST(i) and store result in ST(0).
FDIV ST(i), ST(0)    DC F8+i   Divide ST(i) by ST(0) and store result in ST(i).
FDIVP ST(i), ST(0)   DE F8+i   Divide ST(i) by ST(0), store result in ST(i), and pop the register stack.
FDIVP                DE F9     Divide ST(1) by ST(0), store result in ST(1), and pop the register stack.
FDIVRP               DE F1     Divide ST(0) by ST(1), store result in ST(1), and pop the register stack.

AMD:
FDIV ST(0), ST(i)    D8 F0+i   Replace ST(0) with ST(0)/ST(i).
FDIV ST(i), ST(0)    DC F8+i   Replace ST(i) with ST(i)/ST(0).
FDIVP ST(i), ST(0)   DE F8+i   Replace ST(i) with ST(i)/ST(0), and pop the x87 register stack.
FDIVP                DE F9     Replace ST(1) with ST(1)/ST(0), and pop the x87 register stack.

My opcodes 'DE F9' and 'DE F1' are correct.
The assembler exchanges
    fdivp <==> fdivrp,  fsubp <==> fsubrp

Old code from 1991:
In it I have to change
    fsubr --> fsub,  fdivr --> fdiv
to get correct behavior _today_.
--------------------------------------------------------
; sc/24.1.91
        TITLE   acos87
        .386
        .387
        .MODEL  small
        PUBLIC  _acos87
        .DATA
        COMM    _deg_87:DWORD
        .DATA?
        .CONST
$radtodeg DT    57.295779513082320876798
        ALIGN   4
        .CODE
_acos87 PROC
        fld     QWORD PTR [esp+4]
        fld     st
        fmul    st, st
        fld1
        fsubr
        fsqrt
        fdivr
        fld1
        fpatan
        mov     eax, _deg_87
        cmp     eax, 0
        jg      SHORT $deg
        ret
0000000000000024 <sd>:
  24: 55                    push   %rbp
  25: 48 89 e5              mov    %rsp,%rbp
  28: d9 eb                 fldpi
  2a: d9 e8                 fld1
  2c: de f1                 fdivp  %st,%st(1)
  2e: 5d                    pop    %rbp
  2f: c3                    retq

'objdump' gets it wrong too.
'de f1' is NOT the opcode of 'fdivp', but of 'fdivrp'!
Clang -S -masm=att test.c
        #APP
        fldpi
        fld1
        fdivp   %st(1)
        #NO_APP

Clang -S -masm=intel test.c
        #APP
        fldpi
        fld1
        fdivrp  st(1)
        #NO_APP

From _constant_ source:
    __asm__ ("\n\t"
             "fldpi \n\t"
             "fld1  \n\t"
             "fdivp \n\t"
             : "=t"(y) : : );

This wrong behavior is truly powerful...
Very old issue. See, for example:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30117

Part of that has some history on the issue:

Andrew Pinski 2006-12-07 22:40:52 UTC

> This is at most a GNU binutils bug. Please file it with them at
> http://sourceware.org/bugzilla/ .
>
> Also IIRC fdivp's arguments are swapped in AT&T asm mode because of some
> historical accident.
>
> See the comment in i386.c:
> /* The SystemV/386 SVR3.2 assembler, and probably all AT&T derived
>    assemblers, confusingly reverse the direction of the operation for
>    fsub{r} and fdiv{r} when the destination register is not st(0).
>    The Intel assembler doesn't have this brain damage.  Read
>    !SYSV386_COMPAT to figure out what the hardware really does.  */

Also:

#ifndef SYSV386_COMPAT
/* Set to 1 for compatibility with brain-damaged assemblers.  No-one
   wants to fix the assemblers because that causes incompatibility
   with gcc.  No-one wants to fix gcc because that causes
   incompatibility with assemblers...  You can use the option of
   -DSYSV386_COMPAT=0 if you recompile both gcc and gas this way.  */
#define SYSV386_COMPAT 1
#endif
# define EXCHANGE 1

#if EXCHANGE > 0
# define fsubp  "fsubrp"
# define fsubrp "fsubp"
# define fdivp  "fdivrp"
# define fdivrp "fdivp"
#else
# define fsubp  "fsubp"
# define fsubrp "fsubrp"
# define fdivp  "fdivp"
# define fdivrp "fdivrp"
#endif

#undef EXCHANGE

A countermeasure.