Fast ISel isn't able to handle 'insertvalue', which causes a large slowdown
during -O0 compilation. We don't necessarily need to generate an aggregate of
the values here if they're just going to be extracted directly afterwards.
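As a minimal sketch of the idea (current LLVM API, my illustration rather than
the code this change touches): when every user of the aggregate is an
extractvalue, the scalars can be forwarded directly and no insertvalue chain
needs to be emitted.

  #include "llvm/IR/Instructions.h"

  // Hypothetical helper: true if V is only ever decomposed again with
  // extractvalue, so materializing the aggregate is unnecessary.
  static bool onlyUsedByExtractValue(const llvm::Value *V) {
    for (const llvm::User *U : V->users())
      if (!llvm::isa<llvm::ExtractValueInst>(U))
        return false;
    return true;
  }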
<rdar://problem/10530851>
llvm-svn: 146481
test cases where there were a lot of relocations applied relative to a large
rodata section. Gas would create a symbol for each of these, whereas we would
emit them relative to the beginning of the rodata section. This change mimics what
gas does.
Patch by Jack Carter.
llvm-svn: 146468
of the targets we know about. Because this is cached, rebuilds won't
detect when new targets show up. It's also a bit simpler to just say
"all". If users want to restrict the target set, they can still do so,
and then the cache will preserve what they have explicitly set this
field to.
llvm-svn: 146467
undefined result. This adds new ISD nodes for the new semantics,
selecting them when the LLVM intrinsic indicates that the undef behavior
is desired. The new nodes expand trivially to the old nodes, so targets
don't actually need to do anything to support these new nodes besides
indicating that they should be expanded. I've done this for all the
operand types that I could figure out for all the targets. Owners of
various targets, please review and let me know if any of these are
incorrect.
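For reference, "indicating that they should be expanded" amounts to a fragment
like this in a target's TargetLowering constructor (illustrative, not taken
from this patch):

  setOperationAction(ISD::CTLZ_ZERO_UNDEF, MVT::i32, Expand);
  setOperationAction(ISD::CTTZ_ZERO_UNDEF, MVT::i32, Expand);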
Note that the expand behavior is *conservatively correct*, and exactly
matches LLVM's current behavior with these operations. Ideally this
patch will not change behavior in any way. For example the regtest suite
finds the exact same instruction sequences coming out of the code
generator. That's why there are no new tests here -- all of this is
being exercised by the existing test suite.
Thanks to Duncan Sands for reviewing the various bits of this patch and
helping me get the wrinkles ironed out with expanding for each target.
Also thanks to Chris for clarifying through all the discussions that
this is indeed the approach he was looking for. That said, there are
likely still rough spots. Further review much appreciated.
llvm-svn: 146466
Constant pool entries with different alignment may cause more alignment
padding to be inserted. Compute the amount of padding needed, and try to
pick the location that requires the least amount of padding.
Also take the extra padding into account when the water is above the
use.
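For the record, the padding computation is the usual alignment arithmetic
(a generic illustration, not the pass's actual code):

  // Bytes of padding needed to raise Offset to the next multiple of Align
  // (Align must be nonzero).
  unsigned paddingBytes(unsigned Offset, unsigned Align) {
    return (Align - Offset % Align) % Align;
  }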
llvm-svn: 146458
This should always be done as a matter of principle. I don't have a
case that exposes the problem. I just noticed this recently while
scanning the code and realized I meant to fix it long ago.
llvm-svn: 146438
subdirectories to traverse into.
- Originally I wanted to avoid this and just autoscan, but this has one key
flaw in that new subdirectories cannot automatically trigger a rerun of the
llvm-build tool. This is particularly a pain when switching back and forth
between trees where one has added a subdirectory, as the dependencies will
tend to be wrong. This also implicitly eliminates the FIXME.
llvm-svn: 146436
If we create new intervals for a variable that is being spilled, then those new intervals are not guaranteed to be spilled as well. This means that anything reading from the original spilled value might not get the correct value if spills were missed.
Fixes <rdar://problem/10546864>
llvm-svn: 146428
These modifiers simply select either the low or high D subregister of a Neon
Q register. I've also removed the unimplemented 'p' modifier, which turns out
to be a bit different from what the comment here suggests and, as far as I can tell,
was only intended for internal use in Apple's version of gcc.
llvm-svn: 146417
detected in the forward-CFG DFS. This prevents the reverse-CFG traversal from
visiting blocks inside loops after blocks that dominate them in the
case where loops have multiple exits.
No testcase, because this fixes a bug which in practice only shows
up in a full optimizer run, due to the use-list order.
This fixes rdar://10422791 and others.
llvm-svn: 146408
Downgrade the alignment of the initial constant island when constant
pool entries are moved elsewhere.
This is all gated by -arm-align-constant-islands.
llvm-svn: 146391
Order constant pool entries by descending alignment in the initial
island to ensure packing and correct alignment. When the command line
flag is set, also align the basic block containing the constant pool
entries.
This is only a partial implementation of constant island alignment. More
to come.
llvm-svn: 146375
CMake versions 2.8.4 and earlier were giving this error since r146323:
"string end index: -1 is out of range 0 - 6"
Passing -1 as the length of the desired substring was a new feature
added in CMake 2.8.5:
http://www.cmake.org/Bug/view.php?id=10740
llvm-svn: 146372
I followed three heuristics for deciding whether to set 'true' or
'false':
- Everything target independent got 'true' as that is the expected
common output of the GCC builtins.
- If the target arch only has one way of implementing this operation,
set the flag in the way that exercises the most codegen. For most
architectures this is also the likely path from a GCC builtin, with
'true' being set. It will (eventually) require lowering away that
difference, and then lowering to the architecture's operation.
- Otherwise, set the flag differently depending on which target
operation should be tested.
Let me know if anyone has any issue with this pattern or would like
specific tests of another form. This should allow the x86 codegen to
just iteratively improve as I teach the backend how to differentiate
between the two forms, and everything else should remain exactly the
same.
llvm-svn: 146370
intrinsic syntax.
Now that this is explicitly covered, I plan to upgrade the existing test
suite to use an explicit immediate. Note that I plan to specify 'true'
in most places rather than the auto-upgraded value, since 'true' is the far
more common value to end up here, being the value coming from GCC's
builtins. The only place I'm likely to put a 'false' is when testing
x86, which actually has different instructions for the two variants.
llvm-svn: 146369
the behavior with the newly added flag for undefined results on a zero
input.
I'm terrible at documentation, so comments and suggestions welcome here.
llvm-svn: 146361
indicates whether the intrinsic has a defined result for a first
argument equal to zero. This will eventually allow these intrinsics to
accurately model the semantics of GCC's __builtin_ctz and __builtin_clz
and the X86 instructions (prior to AVX) which implement them.
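For reference, here is the source-level behavior being modeled (my example,
not part of the patch):

  // Both builtins leave their result undefined when the argument is zero;
  // the new i1 flag lets the intrinsics express exactly that.
  unsigned counts(unsigned x) {
    unsigned tz = __builtin_ctz(x); // undefined result if x == 0
    unsigned lz = __builtin_clz(x); // undefined result if x == 0
    return tz + lz;
  }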
This patch merely sets the stage by extending the signature of these
intrinsics and establishing auto-upgrade logic so that the old spelling
still works both in IR and in bitcode. The upgrade logic preserves the
existing (inefficient) semantics. This patch should not change any
behavior. CodeGen isn't updated because it can use the existing
semantics regardless of the flag's value.
Note that this will be followed by API updates to Clang and DragonEgg.
Reviewed by Nick Lewycky!
llvm-svn: 146357
The OptLevel is now redundant with the TargetMachine*.
And selectTarget() isn't really JIT-specific and could probably
get refactored into one of the lower level libraries.
llvm-svn: 146355
in CMake a bit more handy. Previously we would get such charming
versions as the following for revision NNNN and commit-ish XXXXX:
3.1svnsvn-rNNNN
3.1svngit-svn-rNNNN
3.1svngit-svn-XXXXX
The mechanism selecting between the latter two was particularly odd, and
apparently didn't work with all of the ways git-svn repos are set up. It
also misses an important point -- both the revision *and* the git commit
might be relevant when working on a local branch some distance from
mainline. The new logic does several things:
1) It strips the redundant initial 'svn'.
2) It always looks for a git-svn revision number base, and when found
includes it in the version.
3) If the git commit-ish for the current HEAD is not exactly that
revision number, it is also included.
The resulting strings should roughly be:
3.1svn-rNNNN
3.1git-svn-rNNNN
3.1git-svn-rNNNN-XXXXX
Suggestions on formatting etc always welcome. =] I've only looked at the
LLVM version string here, not Clang's (yet).
Note that the commit-ish reported is *not* terribly accurate. It updates
when 'cmake' is run, not when the binary is built. Still, it may be
better than nothing, especially if people have fairly long-lived git
repos and branches. This is not a new limitation, just didn't want
anyone to be surprised.
llvm-svn: 146323
The split point is picked such that the newly created water has the same
alignment as the function. This makes the island suitable for constant
pool entries with potentially higher alignment.
This also fixes an issue where the basic block was split one instruction
too late, causing nonconvergence of the algorithm.
<rdar://problem/10550705>
There is still an issue with correctly packing differently aligned
entries in the island.
llvm-svn: 146314
does. The _GLOBAL_OFFSET_TABLE_ is still magical in that we get an R_386_GOTPC,
but it doesn't change the immediate in the same way as when the expression
has no right hand side symbol.
llvm-svn: 146311
Since we're not rewriting IVs in other loops, there's not much reason
to consider their stride when generating formulae.
This should reduce the number of useless formulas considered by LSR.
llvm-svn: 146302
Backwards compatibility with 'gas'. #imm is the preferred and documented
syntax, but lots of existing code uses the '$' prefix, so we should
support it if we can.
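For example (illustrative GNU inline asm, assuming an ARM target), both
spellings should now assemble to the same instruction:

  void moveImm() {
    asm volatile("mov r0, #42" ::: "r0"); // preferred, documented syntax
    asm volatile("mov r0, $42" ::: "r0"); // gas-compatible '$' prefix
  }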
llvm-svn: 146285
When the immediate operand of an AND or BIC instruction isn't representable
in the immediate field of the instruction, but the bitwise negation of the
immediate is, assemble the instruction as the inverse operation instead
with the inverted immediate as the operand.
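As an illustration (my example, not from the patch): 0xffffff00 is not
encodable as an ARM modified immediate, but its bitwise inverse 0xff is, so
the AND below can be assembled as a BIC:

  unsigned clearLowByte(unsigned v) {
    unsigned out;
    // Written as AND with an unencodable immediate; with this change the
    // assembler emits the equivalent BIC with #0xff instead of rejecting it.
    asm("and %0, %1, #0xffffff00" : "=r"(out) : "r"(v));
    return out;
  }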
rdar://10550057
llvm-svn: 146283
Refactor the instructions into fixed writeback and register-stride
writeback variants to simplify the offset operand (no more optional
register operand using reg0). This is a simpler representation and allows
the assembly parser to more easily handle these instructions.
Add tests for the instruction variants now supported.
llvm-svn: 146278
generates the DWARF compile unit DIE and a DWARF subprogram DIE for each
non-temporary label.
The next part will be to get the clang driver to enable this when assembling
a .s file. rdar://9275556
llvm-svn: 146262
Patch by Brendon Cahoon!
This extends the existing LoopUnroll and LoopUnrollPass. Brendon
measured no regressions in the llvm test suite with -unroll-runtime
enabled. This implementation works by using the existing loop
unrolling code to unroll the loop by a power-of-two (default 8). It
generates an if-then-else sequence of code prior to the loop to
execute the extra iterations before entering the unrolled loop.
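Roughly, the generated code has the shape below (a C++ sketch under my own
simplifications; the actual prologue is emitted as an if-then-else sequence
rather than a loop):

  void body(int i); // stand-in for the original loop body

  void runtimeUnrolled(int n) {
    int i = 0;
    int rem = n % 8;          // leftover iterations, run before the loop
    for (; i < rem; ++i)
      body(i);
    for (; i < n; i += 8) {   // main loop, unrolled by the default factor of 8
      body(i);     body(i + 1); body(i + 2); body(i + 3);
      body(i + 4); body(i + 5); body(i + 6); body(i + 7);
    }
  }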
llvm-svn: 146245
MipsTargetLowering::LowerGlobalTLSAddress. This is necessary to have
call16(__tls_get_addr) emitted instead of got_disp(__tls_get_addr) when the
target is Mips64.
llvm-svn: 146183
- Modify lowering of global TLS address nodes.
- Modify isel of ThreadPointer.
- Wrap target global TLS address nodes that are operands of loads with WrapperPIC.
- Remove Mips-specific DAG nodes TlsGd, TprelHi and TprelLo, which can be
substituted with other existing nodes.
llvm-svn: 146175
clients to decide whether to look inside bundled instructions and whether
the query should return true if any / all bundled instructions have the
queried property.
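The shape of the interface is roughly as follows (names as I recall them;
treat this as a sketch rather than the definitive API):

  // How a MachineInstr property query treats instruction bundles.
  enum QueryType {
    IgnoreBundle, // consider only the queried instruction itself
    AnyInBundle,  // true if any instruction in the bundle has the property
    AllInBundle   // true only if all instructions in the bundle have it
  };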
llvm-svn: 146168
if (HasAVX)
  X86SSELevel = NoMMXSSE;
This is so that patterns predicated on hasSSE3, etc., are not selected when AVX is available; instead, the AVX variant is selected.
However, this breaks instructions which do not have AVX variants.
The right way to fix this is for the SSE but not-AVX patterns to predicate on something like hasSSE3() && !hasAVX().
Then we can take out the hack in X86Subtarget.cpp. Patterns which do not have AVX variants do not need to change.
However, we need to audit all the patterns before we make the change. This patch is a workaround that fixes one specific case,
the prefetch instructions. rdar://10538297
llvm-svn: 146163
We must not issue a bitcast operation for integer-promotion of vector types, because the
location of the values in the vector may be different.
llvm-svn: 146150
It is not used any more. We are tracking inline assembly misalignments
directly through the BBInfo.Unalign and KnownBits fields.
A simple conservative size estimate is not good enough since it can
cause alignment padding to be underestimated.
llvm-svn: 146124
Compute alignment padding before and after basic blocks dynamically.
Heed basic block alignment.
This simplifies bookkeeping because we don't have to constantly add and
remove padding from BBInfo.Size. It also makes it possible to track the
extra known alignment bits we get after a tBR_JTr terminator and when
entering an aligned basic block.
This makes the ARMConstantIslandPass aware of aligned basic blocks.
It is tricky to model block alignment correctly when dealing with inline
assembly and tBR_JTr instructions that have variable size. If inline
assembly turns out to be smaller than expected, that may cause following
alignment padding to be larger than expected. This could cause constant
pool entries to move out of range.
To avoid that problem, we use the worst case alignment padding following
inline assembly. This may cause slightly suboptimal constant island
placement in aligned basic blocks following inline assembly. Normal
functions should be unaffected.
llvm-svn: 146118
files. First, add a new block USELIST_BLOCK to the bitcode format. This is
where USELIST_CODE_ENTRYs will be stored. The format of the USELIST_CODE_ENTRYs
has not yet been defined. Add support in the BitcodeReader for parsing the
USELIST_BLOCK.
Part of rdar://9860654 and PR5680.
llvm-svn: 146078