llvm-project/llvm/lib
Ahmed Bougacha 23a0d1a1d6 [X86] Don't custom-lower vNi32 uint_to_fp when unsafe-fp-math.
The custom code produces incorrect results if later reassociated.

Since r221657, on x86, vNi32 uitofp is lowered using an optimized
sequence:

  movdqa LCPI0_0(%rip), %xmm1 ## xmm1 = [65535, ...]
  pand %xmm0, %xmm1
  por LCPI0_1(%rip), %xmm1 ## [0x4b000000, ...]
  psrld $16, %xmm0
  por LCPI0_2(%rip), %xmm0 ## [0x53000000, ...]
  addps LCPI0_3(%rip), %xmm0 ## [float -5.497642e+11, ...]
  addps %xmm1, %xmm0
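
The constants in the listing can be decoded as IEEE-754 single-precision values. The snippet below is a small illustrative sketch, not LLVM code: `bits_to_f32`, `split`, and the constant names are hypothetical helpers introduced here. It assumes the addps constant is exactly -(2**39 + 2**23), which is consistent with the printed value -5.497642e+11.

```python
import struct

def bits_to_f32(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE-754 single (hypothetical helper)."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

LO_BIAS = bits_to_f32(0x4B000000)  # 2**23, OR'd onto the low 16 bits of the input
HI_BIAS = bits_to_f32(0x53000000)  # 2**39, OR'd onto the high 16 bits of the input
MAGIC = -(LO_BIAS + HI_BIAS)       # assumed value of the addps constant

def split(x: int):
    """Mimic pand/por and psrld/por: returns (2**23 + lo16, 2**39 + hi16 * 2**16)."""
    lo = bits_to_f32((x & 0xFFFF) | 0x4B000000)
    hi = bits_to_f32((x >> 16) | 0x53000000)
    return lo, hi
```

With these values, (hi + MAGIC) + lo cancels both biases and leaves hi16 * 2**16 + lo16, i.e. the original integer, rounded once to single precision.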

Since r240361, the machine combiner opportunistically reassociates
2-instruction sequences (with -ffast-math). In the new code sequence,
the ADDPS' are eligible. In isolation, for simple examples (without
reassociable users), this makes no performance difference (the goal
being to enable reassociation of longer chains).

In the trivial example (just one uitofp), the reassociation doesn't
happen, because (I think) it would require the emission of a separate
movaps for a constantpool load (instead of folding it into addps).

However, when we have multiple uitofp sequences, and the constantpool
loads are CSE'd earlier, the machine combiner can do the reassociation.

When the ADDPS' are reassociated, the resulting sequence is no longer
correct, as we'd be adding large (2**39) constants to comparatively
small values (~2**23). Given that two of the three inputs are powers
of 2 larger than 2**16, and that ulp(2**39) == 2**(39-24) == 2**15,
the reassociated chain will produce 0 for any input in [0, 2**14[.
In my testing, it also produces wrong results for 99.5% of [0, 2**32[.
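
The failure can be reproduced outside the compiler with a scalar model of the sequence. This is a minimal sketch under stated assumptions, not the actual lowering: the function names are hypothetical, single-precision addition is emulated by rounding a double sum through `struct`, the addps constant is assumed to be -(2**39 + 2**23), and the reassociated form is assumed to be hi + (constant + lo).

```python
import struct

def f32(x: float) -> float:
    """Round a Python float (double) to the nearest IEEE-754 single."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def bits_to_f32(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE-754 single."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def add_f32(a: float, b: float) -> float:
    """Emulate one lane of addps: add, then round to single precision."""
    return f32(a + b)

# Assumed value of the addps constant (prints as -5.497642e+11).
MAGIC = f32(-(2.0 ** 39 + 2.0 ** 23))

def halves(x: int):
    lo = bits_to_f32((x & 0xFFFF) | 0x4B000000)  # 2**23 + low 16 bits
    hi = bits_to_f32((x >> 16) | 0x53000000)     # 2**39 + high 16 bits * 2**16
    return lo, hi

def uitofp_as_written(x: int) -> float:
    """(hi + MAGIC) + lo: the first add is exact, so only one rounding occurs."""
    lo, hi = halves(x)
    return add_f32(add_f32(hi, MAGIC), lo)

def uitofp_reassociated(x: int) -> float:
    """hi + (MAGIC + lo): the inner add rounds lo to a multiple of 2**15."""
    lo, hi = halves(x)
    return add_f32(hi, add_f32(MAGIC, lo))
```

In this model, `uitofp_as_written(12345)` returns 12345.0 while `uitofp_reassociated(12345)` returns 0.0, and `uitofp_reassociated(100000)` returns 98304.0.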

Avoid this by disabling the new lowering under -ffast-math. It does
mean that we'll get slower code than without it, but at least we
won't get egregiously incorrect code.

One might argue that, considering -ffast-math is all but meaningless,
uitofp producing wrong results isn't a compiler bug. But it really is.

Fixes PR24512.

...though this is really more of a workaround.
Ideally, we'd have some sort of Machine FMF, but that's a problem
that's not worth tackling until we do more with machine IR.

llvm-svn: 248965
2015-10-01 00:11:07 +00:00
Analysis Refactor computeKnownBits alignment handling code 2015-09-30 11:55:45 +00:00
AsmParser HHVM calling conventions. 2015-09-29 22:09:16 +00:00
Bitcode [Bitcode][Asm] Teach LLVM to read and write operand bundles. 2015-09-24 23:34:52 +00:00
CodeGen [WinEH] Emit int3 after noreturn calls on Win64 2015-09-30 23:09:23 +00:00
DebugInfo Use makeArrayRef or None to avoid unnecessarily mentioning the ArrayRef type extra times. NFC 2015-09-21 05:32:41 +00:00
ExecutionEngine Remove roundingMode argument in APFloat::mod 2015-09-21 19:29:25 +00:00
Fuzzer [libFuzzer] Marking exported symbols as visible. Patch by Mike Aizatsky 2015-09-30 22:22:37 +00:00
IR Fix debug info with SafeStack. 2015-09-30 19:55:43 +00:00
IRReader Return a unique_ptr from getLazyBitcodeModule and parseBitcodeFile. NFC. 2015-06-16 22:27:55 +00:00
LTO Reapply "LTO: Disable extra verify runs in release builds" 2015-09-15 23:05:59 +00:00
LibDriver There is only one saver of strings. 2015-08-13 01:07:02 +00:00
LineEditor Use ADDITIONAL_HEADER_DIRS in all LLVM CMake projects. 2015-02-11 03:28:02 +00:00
Linker [opaque pointer types] Switch a few cases of getElementType over, since I had them lying around anyway 2015-09-14 20:29:26 +00:00
MC MCAsmInfo: Allow targets to specify when the .section directive should be omitted 2015-09-25 21:41:14 +00:00
Object Prune trailing whitespaces. 2015-09-22 11:19:03 +00:00
Option Add an ArgList::AddAllArgs that accepts a vector of OptSpecifier. 2015-07-29 17:34:41 +00:00
Passes [PM] Port SROA to the new pass manager. 2015-09-12 09:09:14 +00:00
ProfileData InstrProf: Don't call std::unique twice here 2015-09-30 02:02:08 +00:00
Support [BranchProbability] Manually round the floating point output. 2015-09-26 10:09:36 +00:00
TableGen TableGen: Support folding casts from bits to int 2015-07-31 01:12:06 +00:00
Target [X86] Don't custom-lower vNi32 uint_to_fp when unsafe-fp-math. 2015-10-01 00:11:07 +00:00
Transforms [SLP] Don't vectorize loads of non-packed types (like i1, i2). 2015-09-30 21:05:43 +00:00
CMakeLists.txt LibDriver, llvm-lib: introduce. 2015-06-09 21:50:22 +00:00
LLVMBuild.txt Wrap some long lines in LLVMBuild files. NFC 2015-06-12 18:44:57 +00:00
Makefile LibDriver, llvm-lib: introduce. 2015-06-09 21:50:22 +00:00