Summary: Among other things, this allows -print-after-all/-print-before-all to
dump IR around this pass.
IIRC, this pass is off by default, but it's still helpful when debugging.
llvm-svn: 244056
We can't propagate FMF partially without breaking DAG-level CSE. We either need to
relax CSE to account for mismatched FMF as a temporary work-around or fully propagate
FMF throughout the DAG.
Surprisingly, there are no existing regression tests for this, but here's an example:
define float @fmf(float %a, float %b) {
%mul1 = fmul fast float %a, %b
%nega = fsub fast float 0.0, %a
%mul2 = fmul fast float %nega, %b
%abx2 = fsub fast float %mul1, %mul2
ret float %abx2
}
$ llc -o - badflags.ll -march=x86-64 -mattr=fma -enable-unsafe-fp-math -enable-fmf-dag=0
...
vmulss %xmm1, %xmm0, %xmm0
vaddss %xmm0, %xmm0, %xmm0
retq
$ llc -o - badflags.ll -march=x86-64 -mattr=fma -enable-unsafe-fp-math -enable-fmf-dag=1
...
vmulss %xmm1, %xmm0, %xmm2
vfmadd213ss %xmm2, %xmm1, %xmm0 <--- failed to recognize that (a * b) was already calculated
retq
llvm-svn: 244053
Summary: Among other things, this allows -print-after-all/-print-before-all to
dump IR around this pass.
This is the AArch64 version of r243052.
llvm-svn: 244041
return StringSwitch<int>(Flags)
.Case("g", 0x1)
.Case("nzcvq", 0x2)
.Case("nzcvqg", 0x3)
.Default(-1);
...
// The _g and _nzcvqg versions are only valid if the DSP extension is
// available.
if (!Subtarget->hasThumb2DSP() && (Mask & 0x2))
return -1;
ARMARM confirms that the comment is right, and the code was wrong.
llvm-svn: 244029
In r242277, I updated the MachineCombiner to work with itineraries, but I
missed a call that is scheduling-model-only (the opcode-only form of
computeInstrLatency). Using the form that takes an MI* allows this to work with
itineraries (and should be NFC for subtargets with scheduling models).
llvm-svn: 244020
Previously we kept going on partly corrupted input, which might result
in garbage being printed, or even worse, random crashes.
Rafael mentioned that this is the GNU behavior as well, but after some
discussion we both agreed it's probably better to emit a reasonable
error message and exit. As a side-effect of this commit, now we don't
rely on global state for error codes anymore. objdump was the last tool
in the toolchain which needed to be converted. Hopefully the old behavior
won't sneak into the tree again.
llvm-svn: 244019
For example of mingw-w64-g++-4.8.1,
llvm/unittests/ADT/ArrayRefTest.cpp: In member function 'virtual void {anonymous}::ArrayRefTest_AllocatorCopy_Test::TestBody()':
llvm/unittests/ADT/ArrayRefTest.cpp:56:40: internal compiler error: in count_type_elements, at expr.c:5523
} Array3Src[] = {{"hello"}, {"world"}};
^
Please submit a full bug report,
with preprocessed source if appropriate.
llvm-svn: 244017
As documented in the LLVM Coding Standards, indeed MSVC incorrectly asserts
on this in Debug mode. This happens when building clang with Visual C++ and
-triple i686-pc-windows-gnu on these clang regression tests:
clang/test/CodeGen/2011-03-08-ZeroFieldUnionInitializer.c
clang/test/CodeGen/empty-union-init.c
llvm-svn: 243996
Create wrapper methods in the Function class for the OptimizeForSize and MinSize
attributes. We want to hide the logic of "or'ing" them together when optimizing
just for size (-Os).
Currently, we are not consistent about this and rely on a front-end to always set
OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here
that should be added as follow-on patches with regression tests.
This patch is NFC-intended: it just replaces existing direct accesses of the attributes
by the equivalent wrapper call.
Differential Revision: http://reviews.llvm.org/D11734
llvm-svn: 243994
In the commentary for D11660, I wasn't sure if it was alright to create new
integer machine instructions without also creating the implicit EFLAGS operand.
From what I can see, the implicit operand is always created by the MachineInstrBuilder
based on the instruction type, so we don't have to do that explicitly. However, in
reviewing the debug output, I noticed that the operand was not marked as 'dead'.
The machine combiner should do that to preserve future optimization opportunities
that may be checking for that dead EFLAGS operand themselves.
Differential Revision: http://reviews.llvm.org/D11696
llvm-svn: 243990
Summary:
Previously, we would check whether the target is supported or not, only in
fastSelectInstruction(). This means that 64-bit targets could use FastISel too.
We fix this by checking every overridden method of the FastISel class and
by falling back to SelectionDAG if the target isn't supported. This change
should have been committed along with r243638, but somehow I missed it.
Reviewers: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11755
llvm-svn: 243986
It introduced two regressions on 64-bit big-endian targets running under N32
(MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4, and
MultiSource/Applications/kimwitu++/kc) The issue is that on 64-bit targets
comparisons such as BEQ compare the whole GPR64 but incorrectly tell the
instruction selector that they operate on GPR32's. This leads to the
elimination of i32->i64 extensions that are actually required by
comparisons to work correctly.
There's currently a patch under review that fixes this problem.
llvm-svn: 243984
r243883 and r243961 made a use-after-free far more likely:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/6041/steps/check-llvm%20asan/logs/stdio
Unresolved nodes get inserted into the `Cycles` array. If they later
get resolved through RAUW, we need to update the reference. It's
interesting that this never hit before (maybe an asan-ified clang
bootstrap with `-flto -g` would have hit it, but I admit I haven't tried
anything quite that crazy).
llvm-svn: 243976
This change was done as an audit and is by inspection. The new EH
system is still very much a work in progress. NFC for the landingpad
case.
llvm-svn: 243965
r243883 started moving 'distinct' nodes instead of duplicated them in
lib/Linker. This had the side-effect of sometimes not cloning uniqued
nodes that reference them. I missed a corner case:
!named = !{!0}
!0 = !{!1}
!1 = distinct !{!0}
!0 is the entry point for "remapping", and a temporary clone (say,
!0-temp) is created and mapped in case we need to model a uniquing
cycle.
Recursive descent into !1. !1 is distinct, so we leave it alone,
but update its operand to !0-temp.
Pop back out to !0. Its only operand, !1, hasn't changed, so we don't
need to use !0-temp. !0-temp goes out of scope, and we're finished
remapping, but we're left with:
!named = !{!0}
!0 = !{!1}
!1 = distinct !{null} ; uh oh...
Previously, if !0 and !0-temp ended up with identical operands, then
!0-temp couldn't have been referenced at all. Now that distinct nodes
don't get duplicated, that assumption is invalid. We need to
!0-temp->replaceAllUsesWith(!0) before freeing !0-temp.
I found this while running an internal `-flto -g` bootstrap. Strangely,
there was no case of this in the open source bootstrap I'd done before
commit...
llvm-svn: 243961
If we don't have sys/wait.h and we're on a unix system there's no way
that several of the llvm tools work at all. This includes clang.
Just remove the configure and cmake checks entirely - we'll get a
build error instead of building something broken now.
llvm-svn: 243957
On the code path in ExpandUnalignedLoad which expands an unaligned vector/fp
value in terms of a legal integer load of the same size, the ChainResult needs
to be the chain result of the integer load.
No in-tree test case is currently available.
Patch by Jan Hranac!
llvm-svn: 243956
Summary: This patch adds enum value for an existing metadata type -- make.implicit. Using preassigned enum will be helpful to get compile time type checking and avoid string construction and comparison. The patch also changes uses of make.implicit from string metadata to enum metadata. There is no functionality change.
Reviewers: reames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11698
llvm-svn: 243954
This adds the software division routines for the Windows RTABI. These are not
expected to be used often though as most modern Windows ARM capable targets
support hardware division. In the case that the target CPU doesnt support
hardware division, this will be the fallback.
llvm-svn: 243952
contained types into the space when we have no contained types. This
fixes the UB stemming from a call to memcpy with a null pointer. This
also reduces the calls to allocate because this actually happens in
a notable client - Clang.
Found by UBSan.
llvm-svn: 243944
Looks like the rebased version that Mehdi committed didn't incorporate
the latest changes.
Patch by Erik de Castro Lopo <erikd@mega-nerd.com>!
llvm-svn: 243942