The alignment is just a byte in the middle of Characteristics, not an
independent flag. Making it an independent field in the yaml
representation makes it more yamlio friendly.
llvm-svn: 181243
Test case by Michele Scandale!
Fixes PR10293: Load not hoisted out of loop with multiple exits.
There are few regressions with this patch, now tracked by
rdar:13817079, and a roughly equal number of improvements. The
regressions are almost certainly back luck because LoopRotate has very
little idea of whether rotation is profitable. Doing better requires a
more comprehensive solution.
This checkin is a quick fix that lacks generality (PR10293 has
a counter-example). But it trivially fixes the case in PR10293 without
interfering with other cases, and it does satify the criteria that
LoopRotate is a loop canonicalization pass that should avoid
heuristics and special cases.
I can think of two approaches that would probably be better in
the long run. Ultimately they may both make sense.
(1) LoopRotate should check that the current header would make a good
loop guard, and that the loop does not already has a sufficient
guard. The artifical SimplifiedLoopLatch check would be unnecessary,
and the design would be more general and canonical. Two difficulties:
- We need a strong guarantee that we won't endlessly rotate, so the
analysis would need to be precise in order to avoid the
SimplifiedLoopLatch precondition.
- Analysis like this are usually based on SCEV, which we don't want to
rely on.
(2) Rotate on-demand in late loop passes. This could even be done by
shoving the loop back on the queue after the optimization that needs
it. This could work well when we find LICM opportunities in
multi-branch loops. This requires some work, and it doesn't really
solve the problem of SCEV wanting a loop guard before the analysis.
llvm-svn: 181230
As pointed out by Rafael Espindola, we should match the DWARF encodings
produced by GCC in both pic and non-pic modes. This was not the case
for the non-pic case.
This patch changes all DWARF encodings to DW_EH_PE_absptr for the
non-pic case, just like GCC does. The test case is updated to check
for both variants.
llvm-svn: 181222
A * (1 - (uitofp i1 C)) -> select C, 0, A
B * (uitofp i1 C) -> select C, B, 0
select C, 0, A + select C, B, 0 -> select C, B, A
These come up in code that has been hand-optimized from a select to a linear blend,
on platforms where that may have mattered. We want to undo such changes
with the following transform:
A*(1 - uitofp i1 C) + B*(uitofp i1 C) -> select C, A, B
llvm-svn: 181216
This patch wires up the SystemZ target in configure, so that it can now be
built using --enable-targets=systemz. It is not yet included in the default
build (--enable-targets=all); this will be done by a follow-up patch.
Patch by Richard Sandiford.
llvm-svn: 181208
This patch adds the necessary configuration bits and #ifdef's to set up
the JIT/MCJIT test cases for SystemZ. Like other recent targets, we do
fully support MCJIT, but do not support the old JIT at all. Set up the
lit config files accordingly, and disable old-JIT unit tests.
Patch by Richard Sandiford.
llvm-svn: 181207
This adds all DebugInfo tests for the SystemZ target.
This version of the patch incorporates feedback from reviews by
Eric Christopher and Rafael Espindola. Thanks to all reviewers!
Patch by Richard Sandiford.
llvm-svn: 181205
This adds all CodeGen tests for the SystemZ target.
This version of the patch incorporates feedback from a review by
Sean Silva. Thanks to all reviewers!
Patch by Richard Sandiford.
llvm-svn: 181204
This adds the actual lib/Target/SystemZ target files necessary to
implement the SystemZ target. Note that at this point, the target
cannot yet be built since the configure bits are missing. Those
will be provided shortly by a follow-on patch.
This version of the patch incorporates feedback from reviews by
Chris Lattner and Anton Korobeynikov. Thanks to all reviewers!
Patch by Richard Sandiford.
llvm-svn: 181203
This is another patch in preparation for adding the SystemZ target.
It defines the appropriate values for DWARF encodings; the intent
is to be compatible with what GCC currently does on the target.
Patch by Richard Sandiford.
llvm-svn: 181201
Several platforms need to disable all old-JIT unit tests, since they only
support the new MCJIT. This currently done via #ifdef'ing out those tests
in the ExecutionEngine/JIT/*.cpp files. As those #ifdef's have grown
historically, we now have a number of repeated directives which -in total-
cover nearly the whole file, but leave a couple of helper functions out.
When building the tests with clang itself, those helper functions now
cause spurious "unused function" warnings.
To fix those warnings, and also to remove the duplicate #ifdef conditions
and make it easier to disable the tests for a new target, this patch
consolidates the #ifdefs into a single one per file, which covers all
the tests including all helper routines.
Tested on PowerPC and SystemZ.
llvm-svn: 181200
As pointed out by Evgeniy Stepanov, assigning a std::string temporary
to a StringRef is not a good idea. Rework MatchRegisterName to avoid
using the .lower routine.
llvm-svn: 181192
We used to disable constant merging not only if a constant is llvm.used, but
also if an alias of a constant is llvm.used. This change fixes that.
llvm-svn: 181175
This gets exception handling working on ELF and Macho (x86-64 at least).
Other than the EH frame registration, this patch also implements support
for GOT relocations which are used to locate the personality function on
MachO.
llvm-svn: 181167
indirect branch at the end of the BB. Otherwise if-converter, branch folding
pass may incorrectly update its successor info if it consider BB as fallthrough
to the next BB.
rdar://13782395
llvm-svn: 181161
Now even the small structures could be passed within byval (small enough
to be stored in GPRs).
In regression tests next function prototypes are checked:
PR15293:
%artz = type { i32 }
define void @foo(%artz* byval %s)
define void @foo2(%artz* byval %s, i32 %p, %artz* byval %s2)
foo: "s" stored in R0
foo2: "s" stored in R0, "s2" stored in R2.
Next AAPCS rules are checked:
5.5 Parameters Passing, C.4 and C.5,
"ParamSize" is parameter size in 32bit words:
-- NSAA != 0, NCRN < R4 and NCRN+ParamSize > R4.
Parameter should be sent to the stack; NCRN := R4.
-- NSAA != 0, and NCRN < R4, NCRN+ParamSize < R4.
Parameter stored in GPRs; NCRN += ParamSize.
llvm-svn: 181148
X86ISelLowering has support to treat:
(icmp ne (and (xor %flags, -1), (shl 1, flag)), 0)
as if it were actually:
(icmp eq (and %flags, (shl 1, flag)), 0)
However, r179386 has code at the InstCombine level to handle this.
llvm-svn: 181145
Add support for min/max reductions when "no-nans-float-math" is enabled. This
allows us to assume we have ordered floating point math and treat ordered and
unordered predicates equally.
radar://13723044
llvm-svn: 181144
Add support for matching 'ordered' and 'unordered' floating point min/max
constructs.
In LLVM we can express min/max functions as a combination of compare and select.
We have support for matching such constructs for integers but not for floating
point. In floating point math there is no total order because of the presence of
'NaN'. Therefore, we have to be careful to preserve the original fcmp semantics
when interpreting floating point compare select combinations as a minimum or
maximum function. The resulting 'ordered/unordered' floating point maximum
function has to select the same value as the select/fcmp combination it is based
on.
ordered_max(x,y) = max(x,y) iff x and y are not NaN, y otherwise
unordered_max(x,y) = max(x,y) iff x and y are not NaN, x otherwise
ordered_min(x,y) = min(x,y) iff x and y are not NaN, y otherwise
unordered_min(x,y) = min(x,y) iff x and y are not NaN, x otherwise
This matches the behavior of the underlying select(fcmp(olt/ult/.., L, R), L, R)
construct.
Any code using this predicate has to preserve this semantics.
A follow-up patch will use this to implement floating point min/max reductions
in the vectorizer.
radar://13723044
llvm-svn: 181143
This is about the simplest relocation, but surprisingly rare in actual
code.
It occurs in (for example) the MCJIT test test-ptr-reloc.ll.
llvm-svn: 181134
As with global accesses, external functions could exist anywhere in
memory. Therefore the stub must create a complete 64-bit address. This
patch implements the fragment as (roughly):
movz x16, #:abs_g3:somefunc
movk x16, #:abs_g2_nc:somefunc
movk x16, #:abs_g1_nc:somefunc
movk x16, #:abs_g0_nc:somefunc
br x16
In principle we could save 4 bytes by using a literal-load instead,
but it is unclear that would be more efficient and can only be tested
when real hardware is readily available.
This allows (for example) the MCJIT test 2003-05-07-ArgumentTest to
pass on AArch64.
llvm-svn: 181133
The large memory model (default and main viable for JIT) emits
addresses in need of relocation as
movz x0, #:abs_g3:somewhere
movk x0, #:abs_g2_nc:somewhere
movk x0, #:abs_g1_nc:somewhere
movk x0, #:abs_g0_nc:somewhere
To support this we must implement those four relocations in the
dynamic loader.
This allows (for example) the test-global.ll MCJIT test to pass on
AArch64.
llvm-svn: 181132
R_AARCH64_PCREL32 is present in even trivial .eh_frame sections and so
is required to compile any function without the "nounwind" attribute.
This change implements very basic infrastructure in the RuntimeDyldELF
file and allows (for example) the test-shift.ll MCJIT test to pass
on AArch64.
llvm-svn: 181131
AArch64 is going to need some kind of cache-invalidation in order to
successfully JIT since it has a weak memory-model. This is provided by
a __clear_cache builtin in libgcc, which acts very much like the
32-bit ARM equivalent (on platforms where it exists).
llvm-svn: 181129
This let us to remove some custom code that matched constant offsets
from globals at instruction selection time as a special addressing mode.
No intended functionality change.
llvm-svn: 181126
The code now makes use of ComputeMaskedBits,
SelectionDAG::isBaseWithConstantOffset and TargetLowering::isGAPlusOffset
where appropriate reducing the amount of logic needed in XCoreISelLowering.
No intended functionality change.
llvm-svn: 181125
Thread local storage is not supported by the XMOS linker so we handle
thread local variables by lowering the variable to an array of n elements
(where n is the number of hardware threads per core, currently 8
for all XMOS devices) indexed by the the current thread ID.
Previously this lowering was spread across the XCoreISelLowering and the
XCoreAsmPrinter classes. Moving this to a separate pass should be much
cleaner.
llvm-svn: 181124
The MOVZ/MOVK instruction sequence may not be the most efficient (a
literal-pool load could be better) but adding that would require
reinstating the ConstantIslands pass.
For now the sequence is correct, and that's enough. Beware, as of
commit GNU ld does not appear to support the relocations needed for
this. Its primary purpose (for now) will be to support JITed code,
since in that case there is no guarantee of where your code will end
up in memory relative to external symbols it references.
llvm-svn: 181117
The intended semantics mirror autoconf, where the user is able to
specify a host triple, but if it's left to the build system then
"config.guess" is invoked for the default.
This also renames the LLVM_HOSTTRIPLE define to LLVM_HOST_TRIPLE to
fit in with the style of the surrounding defines.
llvm-svn: 181112
Now that we hava a convinient place to keep it, remeber the set of
identified structs as we merge modules.
This speeds up the linking of all the bitcode files in clang with the
gold plugin and -plugin-opt=emit-llvm (i.e., link only, no codegen) from
5:25 minutes to 13.6 seconds!
Patch by Xiaofei Wan!
llvm-svn: 181104
Update comments, fix * placement, fix method names that are not
used in clang, add a linkInModule that takes a Mode and put it
in Linker.cpp.
llvm-svn: 181099