Summary:
The existing code was correct for 32-bit GPR's but not 64-bit GPR's. It now
accounts for both cases.
Reviewers: vkalintiris
Reviewed By: vkalintiris
Subscribers: llvm-commits, mohit.bhakkad, sagar
Differential Revision: http://reviews.llvm.org/D9337
llvm-svn: 236099
Summary:
Do the assemble-time shifts from createLShiftOri at the source, which groups all the shifting together, closer to the main logic path, and
store the results in concisely-named variables to improve code clarity.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8973
llvm-svn: 236096
It doesn't have a maintainer and none of the release testers test it,
so I don't think it should be part of the release.
http://reviews.llvm.org/D9331
llvm-svn: 236077
We were trying to look through COPY instructions, but only to the next
instruction in a BB and incorrectly anyway. The cases where that would actually
be a good idea are rare enough (and not even tested!) that it's not worth
trying to get right.
rdar://20721342
llvm-svn: 236050
This is a compromise: with this simple patch, we should always handle a chain of exactly 3
operations optimally, but we're not generating the optimal balanced binary tree for a longer
sequence.
In general, this transform will reduce the dependency chain for a sequence of instructions
using N operands from a worst case N-1 dependent operations to N/2 dependent operations.
The optimal balanced binary tree would reduce the chain to log2(N).
The trade-off for not dealing with longer sequences is: (1) we have less complexity in the
compiler, (2) we avoid unknown compile-time blowup calculating a balanced tree, and (3) we
don't need to worry about the increased register pressure required to parallelize longer
sequences. It also seems unlikely that we would ever encounter really long strings of
dependent ops like that in the wild, but I'm not sure how to verify that speculation.
FWIW, I see no perf difference for test-suite running on btver2 (x86-64) with -ffast-math
and this patch.
We can extend this patch to cover other associative operations such as fmul, fmax, fmin,
integer add, integer mul.
This is a partial fix for:
https://llvm.org/bugs/show_bug.cgi?id=17305
and if extended:
https://llvm.org/bugs/show_bug.cgi?id=21768https://llvm.org/bugs/show_bug.cgi?id=23116
The issue also came up in:
http://reviews.llvm.org/D8941
Differential Revision: http://reviews.llvm.org/D9232
llvm-svn: 236031
Summary:
We don't seem to need to assert here, since this function's callers expect
to get a nullptr on error. This way we don't assert on user input.
Bug found with AFL fuzz.
Reviewers: rafael
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9308
llvm-svn: 236027
We don't need codegen-only intrinsic instructions for the vector forms of these instructions.
This makes the reciprocal estimate instruction lowering identical to how we handle normal
square roots: (V)SQRTPS / (V)SQRTPD.
No existing regression tests fail with this patch.
Differential Revision: http://reviews.llvm.org/D9301
llvm-svn: 236013
Fixes a crash with basically any OpenGL application using the radeonsi
driver.
Patch by: Michel Dänzer
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90176
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 236004
llc converts all feature strings to lower case, while the LLVM C API
does not, so we need a lower case alias in order to test this with llc.
llvm-svn: 236003
We need to track if an AddrSpaceCast expression was seen when
generating an MCExpr for a ConstantExpr. This change introduces a
custom lowerConstant method to the NVPTX asm printer that will create
NVPTXGenericMCSymbolRefExpr nodes at the appropriate places to encode
the information that a given symbol needs to be casted to a generic
address.
llvm-svn: 236000
This is a preliminary step to using the IR-level floating-point fast-math-flags in the SDAG (D8900).
In this patch, we introduce the optimization flags as their own struct. As noted in the TODO comment,
we should eventually share this data between the IR passes and the backend.
We also switch the existing nsw / nuw / exact bit functionality of the BinaryWithFlagsSDNode class to
use the new struct.
The tradeoff is that instead of using the free but limited space of SDNode's SubclassData, we add a
data member to the subclass. This means we don't have to repeat all of the get/set methods per flag,
but we're potentially adding size to all nodes of this subclassi type.
In practice on 64-bit systems (measured on Linux and MacOS X), there is no size difference between an
SDNode and BinaryWithFlagsSDNode after this change: they're both 80 bytes. This means that we had at
least one free byte to play with due to struct alignment.
Differential Revision: http://reviews.llvm.org/D9325
llvm-svn: 235997
Summary: If the immediate is 0, the ORi is pointless.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8969
llvm-svn: 235990
[DebugInfo] Add debug locations to constant SD nodes
This adds debug location to constant nodes of Selection DAG and updates
all places that create constants to pass debug locations
(see PR13269).
Can't guarantee that all locations are correct, but in a lot of cases choice
is obvious, so most of them should be. At least all tests pass.
Tests for these changes do not cover everything, instead just check it for
SDNodes, ARM and AArch64 where it's easy to get incorrect locations on
constants.
This is not complete fix as FastISel contains workaround for wrong debug
locations, which drops locations from instructions on processing constants,
but there isn't currently a way to use debug locations from constants there
as llvm::Constant doesn't cache it (yet). Although this is a bit different
issue, not directly related to these changes.
Differential Revision: http://reviews.llvm.org/D9084
llvm-svn: 235989
Summary: The new name is more accurate with regard to the functionality.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8968
llvm-svn: 235984
Summary: This removes multiple calls to getReg() and saves us column space in the source file.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8924
llvm-svn: 235978
This adds debug location to constant nodes of Selection DAG and updates
all places that create constants to pass debug locations
(see PR13269).
Can't guarantee that all locations are correct, but in a lot of cases choice
is obvious, so most of them should be. At least all tests pass.
Tests for these changes do not cover everything, instead just check it for
SDNodes, ARM and AArch64 where it's easy to get incorrect locations on
constants.
This is not complete fix as FastISel contains workaround for wrong debug
locations, which drops locations from instructions on processing constants,
but there isn't currently a way to use debug locations from constants there
as llvm::Constant doesn't cache it (yet). Although this is a bit different
issue, not directly related to these changes.
Differential Revision: http://reviews.llvm.org/D9084
llvm-svn: 235977
As a space optimization, this instruction would just encode the pointer
type of the first operand and use the knowledge that the second and
third operands would be of the pointee type of the first. When typed
pointers go away, this assumption will no longer be available - so
encode the type of the second operand explicitly and rely on that for
the third.
Test case added to demonstrate the backwards compatibility concern,
which only comes up when the definition of the second operand comes
after the use (hence the weird basic block sequence) - at which point
the type needs to be explicitly encoded in the bitcode and the record
length changes to accommodate this.
llvm-svn: 235966
This matches other assemblers and is less unexpected (e.g. PR23227).
On ELF, I tried binutils gas v2.24 and nasm 2.10.09, and they both
agree on LShr. On COFF, I couldn't get my hands on an assembler yet,
so don't change the behavior. For now, don't change it on non-AArch64
Darwin either, as the other assembler is gas v1.38, which does an AShr.
llvm-svn: 235963
Support up to 2^16 arguments to a function. If we do hit the limit,
assert out rather than restarting at 0 as we've done historically.
This fixes PR23332. A clang test will follow.
llvm-svn: 235955
Defaulting to AShr without consulting the target MCAsmInfo isn't OK.
Add a flag to fix that. Keep it off for now: target migrations will
follow in separate commits.
llvm-svn: 235951
I previously thought switch clusters would need to use uint64_t in case
the weights of multiple cases overflowed a 32-bit int. It turns
out that the weights on a terminator instruction are capped to allow for
being added together, so using a uint32_t should be safe.
llvm-svn: 235945
Reverse libLTO's default behaviour for preserving use-list order in
bitcode, and add API for controlling it. The default setting is now
`false` (don't preserve them), which is consistent with `clang`'s
default behaviour.
Users of libLTO should call `lto_codegen_should_embed_uselists(CG,true)`
prior to calling `lto_codegen_write_merged_modules()` whenever the
output file isn't part of the production workflow in order to reproduce
results with subsequent calls to `llc`.
(I haven't added tests since `llvm-lto` (the test tool for LTO) doesn't
support bitcode output, and even if it did: there isn't actually a good
way to test whether a tool has passed the flag. If the order is already
"natural" (if the order will already round-trip) then no use-list
directives are emitted at all. At some point I'll circle back to add
tests to `llvm-as` (etc.) that they actually respect the flag, at which
point I can somehow add a test here as well.)
llvm-svn: 235943
Previously, the code would try to put a fall-through case last,
even if that meant moving a case with much higher branch weight
further down the chain.
Ordering by branch weight is most important, putting a fall-through
block last is secondary.
llvm-svn: 235942
These look like copy/paste errors, and shouldn't have the "prior to"
qualifier. Each API was introduced at the given values of
`LTO_API_VERSION`. The "prior to" in other doxygen comments is because
I couldn't easily differentiate between versions 1 and 2 when I added
these comments.
llvm-svn: 235925
After legalization, scalar SETCC has an i32 result type on AArch64.
The i1 requirement seems too conservative, replace it with an assert.
This also means that we now can run after legalization. That should also
be fine, since the ops legalizer runs again after each combine, and
all types created all have the same sizes as the (legal) inputs.
Exposed by r235917; while there, robustize its tests (bsl also uses the
register it defines).
llvm-svn: 235922
The gold binary is not required to build the plugin. All that is
needed is for LLVM_BINUTILS_INCDIR to point to the directory
containing plugin-api.h.
llvm-svn: 235918
When the setcc has f64 operands, we can't build a vector setcc mask
to feed a vselect, because f64 doesn't divide v3f32 evenly.
Just bail out when that happens.
llvm-svn: 235917
We need to dereference the signals mutex during handler registration so that we force its construction. This is to prevent the first use being during handling an actual signal because you can't safely allocate memory in a signal handler.
llvm-svn: 235914
Use a few extra bits in the const field (after widening it from a fixed
single bit) to stash the address space which is no longer provided by
the type (and an extra bit in there to specify that we're using that new
encoding).
llvm-svn: 235911
This patch adds a new SSA MI pass that runs on little-endian PPC64
code with VSX enabled. Loads and stores of 4x32 and 2x64 vectors
without alignment constraints are accomplished for little-endian using
lxvd2x/xxswapd and xxswapd/stxvd2x. The existence of the additional
xxswapd instructions hurts performance in comparison with big-endian
code, but they are necessary in the general case to support correct
semantics.
However, the general case does not apply to most vector code. Many
vector instructions are lane-insensitive; they do not "care" which
lanes the parallel computations are performed within, provided that
the resulting data is stored into the correct locations. Thus this
pass looks for computations that perform only lane-insensitive
operations, and remove the unnecessary swaps from loads and stores in
such computations.
Future improvements will allow computations using certain
lane-sensitive operations to also be optimized in this manner, by
modifying the lane-sensitive operations to account for the permuted
order of the lanes. However, this patch only adds the infrastructure
to permit this; no lane-sensitive operations are optimized at this
time.
This code is heavily exercised by the various vectorizing applications
in the projects/test-suite tree. For the time being, I have only added
one simple test case to demonstrate what the pass is doing. Although
it is quite simple, it provides coverage for much of the code,
including the special case handling of copies and subreg-to-reg
operations feeding the swaps. I plan to add additional tests in the
future as I fill in more of the "special handling" code.
Two existing tests were affected, because they expected the swaps to
be present, but they are now removed.
llvm-svn: 235910
Use a loop instruction with a constant extender for a hardware
loop instruction that is too far away from the start of the loop.
This is cheaper than changing the SA register value.
Differential Revision: http://reviews.llvm.org/D9262
llvm-svn: 235882
Summary:
Changed the warning message to show the current value of $at, similar to what clang does for typedef's, and renamed warnIfAssemblerTemporary to a more descriptive name.
I also changed the type of variables which store registers from int to unsigned, updated the relevant test and tried to make the related comments clearer.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8479
llvm-svn: 235881
This reapplies r235194, which was reverted in r235495 because it was causing a
failure in our out-of-tree buildbots for MIPS. With the sign-extension patch
in r235718, this patch doesn't cause any problem any more.
llvm-svn: 235878
Summary:
When used, it is substituted with the number of .macro instantiations we've done up to that point in time.
So if this is the 1st time we've instantiated a .macro (any .macro, regardless of name), \@ will instantiate to 0, if it's the 2nd .macro instantiation, it will instantiate to 1 etc.
It can only be used inside a .macro definition, an .irp definition or an .irpc definition (those last 2 uses are undocumented).
Reviewers: echristo, rafael
Reviewed By: rafael
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D9197
llvm-svn: 235862
Summary:
This patch adds constant folding of insertelement instruction to undef value when index operand is constant and is not less than vector size or is undef.
InstCombine does not support this case, but I'm happy to add it there also if this change is accepted.
Test Plan: Unittests and regression tests for ConstProp pass.
Reviewers: majnemer
Reviewed By: majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9287
llvm-svn: 235854
Patch to allow int8 vectors to be multiplied on the SSE unit instead of being scalarized.
The patch sign extends the i8 lanes to i16, uses the SSE2 pmullw multiplication instruction, then packs the lower byte from each result.
Differential Revision: http://reviews.llvm.org/D9115
llvm-svn: 235837
1) Turns out we're not great at recognizing redundant checks when one is a != and the other is an ==. This is a bug, but it's one that matters to frontend authors.
2) Frontends shouldn't use intrinsics unless strictly neccessary. This has been pretty widely proven by this point and is good to document.
llvm-svn: 235825
Looking into 23095, my best guess is that the CodeGen library itself isn't getting linked and initialized properly. To make this slightly more obvious to consumers of LLVM, emit a different error message if we can tell that the registry is empty vs you've simply happened to name a collector which hasn't been registered.
llvm-svn: 235824
There can be various constant pointers in the IR which do not get relocated at a safepoint. One example is the address of a global variable. Another example is a pointer created via inttoptr. Note that the optimizer itself likes to create such inttoptrs when locally propagating constants through dynamically dead code.
To deal with this, we need to exclude uses of constants from contributing to the liveness of a safepoint which might reach that use. At some later date, it might be worth exploring what could be done to support the relocation of various special types of "constants", but that's future work.
Differential Revision: http://reviews.llvm.org/D9236
llvm-svn: 235821
llvm.frameescape() intrinsic is not a real call. The intrinsic can only exist in the entry block. Inserting a gc.statepoint() before llvm.frameescape() may split the entry block, and push the intrinsic out of the entry block.
Patch by: Swaroop.Sridhar@microsoft.com
Differential Revision: http://reviews.llvm.org/D8910
llvm-svn: 235820
This is a follow-on to D8833 (insertps optimization when the zero mask is not used).
In this patch, we check for the case where the zmask is used, but both input vectors
to the insertps intrinsic are the same operand or the zmask overrides the destination
lane. This lets us replace the 2nd shuffle input operand with the zero vector.
Differential Revision: http://reviews.llvm.org/D9257
llvm-svn: 235810
Add serialization support for function metadata attachments (added in
r235783). The syntax is:
define @foo() !attach !0 {
Metadata attachments are only allowed on functions with bodies. Since
they come before the `{`, they're not really part of the body; since
they require a body, they're not really part of the header. In
`LLParser` I gave them a separate function called from `ParseDefine()`,
`ParseOptionalFunctionMetadata()`.
In bitcode, I'm using the same `METADATA_ATTACHMENT` record used by
instructions. Instruction metadata attachments are included in a
special "attachment" block at the end of a `Function`. The attachment
records are laid out like this:
InstID (KindID MetadataID)+
Note that these records always have an odd number of fields. The new
code takes advantage of this to recognize function attachments (which
don't need an instruction ID):
(KindID MetadataID)+
This means we can use the same attachment block already used for
instructions.
This is part of PR23340.
llvm-svn: 235785
Add a verifier check that only functions with bodies have metadata
attachments. This should help catch bugs in frontends and
transformation passes. Part of PR23340.
llvm-svn: 235784
Add IR support for `Metadata` attachments. Assembly and bitcode support
will follow shortly, but for now we just have unit tests. This is part
of PR23340.
llvm-svn: 235783
Remove unused `PFS` variable and take the `Instruction` by reference.
(Not really related to PR23340, but might as well clean this up while
I'm here.)
llvm-svn: 235782
right scaling.
In the function canFoldInAddressingMode, VT is computed as the type of the
destination/source of a LOAD/STORE operations, instead of the memory type of the
operation.
On targets with a scaling factor on the offset of the LOAD/STORE operations, the
function may return false for actually valid cases. This may then prevent the
selection of profitable pre or post indexed load/store operations, and instead
select pre or post indexed load/store for unprofitable cases.
Patch by Francois de Ferriere <francois.de-ferriere@st.com>!
Differential Revision: http://reviews.llvm.org/D9146
llvm-svn: 235780
Parameterize the separator for attachments, since `Function` metadata
attachments (PR23340) aren't going to use a `,` (comma). No real
functionality change.
llvm-svn: 235775
Collect metadata names once per `AssemblyWriter` instead of every time
we need to print some attachments. Just a drive-by; this caught my eye
while I was refactoring the code in r235772.
llvm-svn: 235774
When using bit tests for hole checks, we call AddPredecessorToBlock to give the
phi node a value from the bit test block. This would break if we've
previously called removePredecessor on the default destination because the
switch is fully covered.
Test case by Mark Lacey.
llvm-svn: 235771
Make room for more than just `Function::isMaterializable()` in the
`GlobalObject` subclass data bitfield. Since we're treating it like a
bitfield, change `Function::Function()` to zero-out the whole thing.
llvm-svn: 235770
Extract the set logic for metadata attachments from `Instruction` so it
can be reused for `Function` (PR23340).
This data structure makes a `SmallVector<>` look (a little) like a map,
just doing the bare minimum to support the `Instruction` (and soon,
`Function`) metadata API.
llvm-svn: 235769
This introduces an intrinsic called llvm.eh.exceptioncode. It is lowered
by copying the EAX value live into whatever basic block it is called
from. Obviously, this only works if you insert it late during codegen,
because otherwise mid-level passes might reschedule it.
llvm-svn: 235768
Technically the operations are different -- the old logic moved items
from the back into the opened-up slots, instead of the usual
`remove_if()` logic of a slow and a fast iterator -- but unless a
profile tells us otherwise I prefer the simpler logic here. Regardless,
there shouldn't be an observable function change.
llvm-svn: 235767
Same as r235145 for the call instruction - the justification, tradeoffs,
etc are all the same. The conversion script worked the same without any
false negatives (after replacing 'call' with 'invoke').
llvm-svn: 235755
In CMake dependencies can be filenames or targets, and targets can't be filenames. The Ninja generator handles filename dependencies because it generates targets for every output file from a command. For example:
add_custom_command(OUTPUT foo.txt COMMAND touch foo.txt)
With the Ninja generator this generates a target foo.txt, but with the Makefile generator it doesn't. This is probably because Ninja explicitly requires these hard dependency ties, and Make just behaves oddly in general.
To fix this we need to make the tablegen actions depend on a target rather than a filename.
llvm-svn: 235732
Check that `@main` is calling `@foo2` (the renamed internal function),
not the `@foo` with external linkage that's been pulled in from the
override file.
llvm-svn: 235730
Summary:
Perform integer extension only when the destination type is one of
i8, i16 & i32 and when the source type is i1, i8 or i16. For other
combinations we fall back to SelectionDAG.
This fixes the test MultiSource/Benchmarks/7zip that was failing in our
out-of-tree MIPS buildbots.
Reviewers: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9243
llvm-svn: 235718
Summary: Constant folding of extractelement with out-of-bound index produces undef also for indexes bigger than 64bit (instead of crash assert failure as before)
Test Plan: Unit tests included.
Reviewers: majnemer
Reviewed By: majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9225
llvm-svn: 235700
Summary: This patch fixes step D4 of Knuth's division algorithm implementation. Negative sign of the step result was not always detected due to incorrect "borrow" handling.
Test Plan: Unit test that reveals the bug included.
Reviewers: chandlerc, yaron.keren
Reviewed By: yaron.keren
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9196
llvm-svn: 235699
Summary:
Fixes a bug in the NVPTX codegen. The code used to miss necessary "generic()"
on aggregates of addrspacecasts.
Test Plan: addrspacecast-gvar.ll
Reviewers: eliben, jholewinski
Reviewed By: jholewinski
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D9130
llvm-svn: 235689
This should make it clear under which narrow circumstances implicit
physreg uses are okay when rematerializing and prevent people from
accidentally allowing too much when overriding
isReallyTriviallyReMaterializable() even with the weaker assert in the
RegisterCoalescer.
llvm-svn: 235679
Currently symbol names are printed in quotes if it contains something
outside of the arbitrary set of characters that isAcceptableChar tests
for. On somem targets, it is never OK to print a symbol name in quotes
so allow targets to opt out of this behavior.
llvm-svn: 235670
I couldn't provide a testcase as none of the public targets has wide
register classes with alot of subregisters and at the same time an
instruction which "ReMaterializable" and "AsCheapAsAMove" (could
probably be added for R600).
llvm-svn: 235668
Match binutils by supporting the optional register name prefix for new vector
registers ("vs" for VSX registers and "q" for QPX registers).
llvm-svn: 235665
Add assembler/disassembler support for dcbt/dcbtst (and aliases) with the hint
field specified (non-zero). Unforunately, the syntax for this instruction is
special in that it differs for server vs. embedded cores:
dcbt ra, rb, th [server]
dcbt th, ra, rb [embedded]
where th can be omitted when it is 0. dcbtst is the same. Thus we need to play
games in the parser and the printer to flip the operands around on the embedded
cores. We'll use the server syntax as the default (binutils currently uses the
embedded form by default, but IBM is changing that).
We also stop marking dcbtst as having unmodeled side effects (this is not
necessary, it is just a hint like dcbt -- noticed by inspection, so no separate
test case).
llvm-svn: 235657
(reverted in r235533)
Original commit message:
"Calls to llvm::Value::mutateType are becoming extra-sensitive now that
instructions have extra type information that will not be derived from
operands or result type (alloca, gep, load, call/invoke, etc... ). The
special-handling for mutateType will get more complicated as this work
continues - it might be worth making mutateType virtual & pushing the
complexity down into the classes that need special handling. But with
only two significant uses of mutateType (vectorization and linking) this
seems OK for now.
Totally open to ideas/suggestions/improvements, of course.
With this, and a bunch of exceptions, we can roundtrip an indirect call
site through bitcode and IR. (a direct call site is actually trickier...
I haven't figured out how to deal with the IR deserializer's lazy
construction of Function/GlobalVariable decl's based on the type of the
entity which means looking through the "pointer to T" type referring to
the global)"
The remapping done in ValueMapper for LTO was insufficient as the types
weren't correctly mapped (though I was using the post-mapped operands,
some of those operands might not have been mapped yet so the type
wouldn't be post-mapped yet). Instead use the pre-mapped type and
explicitly map all the types.
llvm-svn: 235651
We were asserting on code like this:
extern "C" unsigned long _exception_code();
void might_crash(unsigned long);
void foo() {
__try {
might_crash(0);
} __except(1) {
might_crash(_exception_code());
}
}
Gtest and many other libraries get the exception code from the __except
block. What's supposed to happen here is that EAX is live into the
__except block, and it contains the exception code. Eventually we'll
represent that as a use of the landingpad ehptr value, but for now we
can replace it with undef.
llvm-svn: 235649
remove copies that are useful after breaking some hardware dependencies.
In other words, handle this kind of situations conservatively by assuming reg2
is redefined by the undef flag.
reg1 = copy reg2
= inst reg2<undef>
reg2 = copy reg1
Copy propagation used to remove the last copy.
This is incorrect because the undef flag on reg2 in inst, allows next
passes to put whatever trashed value in reg2 that may help.
In practice we end up with this code:
reg1 = copy reg2
reg2 = 0
= inst reg2<undef>
reg2 = copy reg1
This fixes PR21743.
llvm-svn: 235647
When the base register index of the vector plus the constant offset
was less than zero, we were passing the wrong base register to the indirect
addressing instruction.
In this case, we need to set the base register to v0 and then add
the computed (negative) index to m0.
llvm-svn: 235641
The order in which branches appear in ImmBranches is approximately their
order within the function body. By visiting later branches first, we reduce
the distance between earlier forward branches and their targets, making it
more likely that the cbn?z optimization, which can only apply to forward
branches, will succeed for those earlier branches.
Differential Revision: http://reviews.llvm.org/D9185
llvm-svn: 235640
In particular, this preserves the kill flag, which allows the Thumb2 cbn?z
optimization to be applied in cases where a branch has been re-created after
the live variables analysis pass, e.g. by the machine block placement pass.
This appears to be low risk; a number of other targets seem to already be
doing something similar, e.g. AArch64, PowerPC.
Differential Revision: http://reviews.llvm.org/D9184
llvm-svn: 235639
This allows the constant island pass to lower these branches to cbn?z
instructions, resulting in a shorter instruction sequence.
Differential Revision: http://reviews.llvm.org/D9183
llvm-svn: 235638
This makes it more likely that we can use the 16-bit push and pop instructions
on Thumb-2, saving around 4 bytes per function.
Differential Revision: http://reviews.llvm.org/D9165
llvm-svn: 235637
This appears to have been introduced back in r76698 as part of an unrelated
change. I can find no official ARM documentation stating that Thumb-2 functions
require 4-byte alignment; in fact, ARM documentation appears to contradict
this (see, e.g., ARM Architecture Reference Manual Thumb-2 Supplement,
section 2.6.1: "Thumb-2 enforces 16-bit alignment on all instructions.").
Also remove code that sets alignment for ARM functions, which is redundant
with code in the MachineFunction constructor, and remove the hidden
-arm-align-constant-islands flag, which has been enabled by default since
r146739 (Dec 2011) and has probably received sufficient testing by now.
Differential Revision: http://reviews.llvm.org/D9138
llvm-svn: 235636
Specifically, if a pointer accesses different underlying objects in each
iteration, don't look through the phi node defining the pointer.
The motivating case is the underlyling-objects-2.ll testcase. Consider
the loop nest:
int **A;
for (i)
for (j)
A[i][j] = A[i-1][j] * B[j]
This loop is transformed by Load-PRE to stash away A[i] for the next
iteration of the outer loop:
Curr = A[0]; // Prev_0
for (i: 1..N) {
Prev = Curr; // Prev = PHI (Prev_0, Curr)
Curr = A[i];
for (j: 0..N)
Curr[j] = Prev[j] * B[j]
}
Since A[i] and A[i-1] are likely to be independent pointers,
getUnderlyingObjects should not assume that Curr and Prev share the same
underlying object in the inner loop.
If it did we would try to dependence-analyze Curr and Prev and the
analysis of the corresponding SCEVs would fail with non-constant
distance.
To fix this, the getUnderlyingObjects API is extended with an optional
LoopInfo parameter. This is effectively what controls whether we want
the above behavior or the original. Currently, I only changed to use
this approach for LoopAccessAnalysis.
The other testcase is to guard the opposite case where we do want to
look through the loop PHI. If we step through an array by incrementing
a pointer, the underlying object is the incoming value of the phi as the
loop is entered.
Fixes rdar://problem/19566729
llvm-svn: 235634
Summary:
We pick this order because SeparateConstOffsetFromGEP may create more
opportunities for SLSR.
Test Plan:
reassociate-geps-and-slsr.ll
no performance regression on internal benchmarks
Reviewers: meheff
Subscribers: llvm-commits, jholewinski
Differential Revision: http://reviews.llvm.org/D9230
llvm-svn: 235632
TableGen had been nicely generating code to print a number of instructions using
shorter aliases (and PowerPC has plenty of short mnemonics), but we were not
calling it. For some of the aliases we support in the parser, TableGen can't
infer the "inverse" alias relationship, so there is still more to do.
Thus, after some hours of updating test cases...
llvm-svn: 235616
This will enable us to create a PDBContext so as to expose some
amount of debug info functionality through a common interace.
Differential Revision: http://reviews.llvm.org/D9205
Reviewed by: Alexey Samsonov
llvm-svn: 235612
Move isDereferenceablePointer function to Analysis. This function recursively tracks dereferencability over a chain of values like other functions in ValueTracking.
This refactoring is motivated by further changes to support dereferenceable_or_null attribute (http://reviews.llvm.org/D8650). isDereferenceablePointer will be extended to perform context-sensitive analysis and IR is not a good place to have such functionality.
Patch by: Artur Pilipenko <apilipenko@azulsystems.com>
Differential Revision: reviews.llvm.org/D9075
llvm-svn: 235611
Summary:
Constant stores of f16 vectors can create NvCast nodes from various
operand types to v4f16 or v8f16 depending on patterns in the stored
constants. This patch adds nvcast rules with v4f16 and v8f16 values.
AArchISelLowering::LowerBUILD_VECTOR has the details on which constant
patterns generate the nvcast nodes.
Reviewers: jmolloy, srhines, ab
Subscribers: rengolin, aemerson, llvm-commits
Differential Revision: http://reviews.llvm.org/D9201
llvm-svn: 235610
Summary:
Set operation action for SINT_TO_FP and UINT_TO_FP nodes with v4i32,
v8i8, v8i16 inputs to allow promotion of v4f16 results.
Add tests for sitofp and uitofp for vec4, vec8, vec16, and i8, i16, i32,
and i64 vectors. Only missing tests are for v16i8 and v16i16 as the
shift operations are too complicated to write a proper check sequence.
The conversions from v4i64 to v4f16 do not depend on this patch - v4i64
is split and the conversion gets handled while lowering v2i64. I am
adding a test here for completeness.
Reviewers: aemerson, rengolin, ab, jmolloy, srhines
Subscribers: rengolin, aemerson, llvm-commits
Differential Revision: http://reviews.llvm.org/D9166
llvm-svn: 235609
Third time's the charm. The previous commit was reverted as a
reverse for-loop in SelectionDAGBuilder::lowerWorkItem did 'I--'
on an iterator at the beginning of a vector, causing asserts
when using debugging iterators. This commit fixes that.
llvm-svn: 235608
Summary:
Make sure the abbrev operands are valid and that we can read/skip them
afterwards.
Bug found with AFL fuzz.
Reviewers: rafael
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9030
llvm-svn: 235595
Patch to remove extra bitcasts from shuffles, this is often a legacy of XformToShuffleWithZero being used to combine bitmaskings (of float vectors bitcast to integer vectors) into shuffles: bitcast(shuffle(bitcast(s0),bitcast(s1))) -> shuffle(s0,s1)
Differential Revision: http://reviews.llvm.org/D9097
llvm-svn: 235578
This patch refactors the definition of common utility function "isInductionPHI" to LoopUtils.cpp.
This fixes compilation error when configured with -DBUILD_SHARED_LIBS=ON
llvm-svn: 235577
This is a re-commit of r235101, which also fixes the problems with the previous patch:
- Switches with only a default case and non-fallthrough were handled incorrectly
- The previous patch tickled a bug in PowerPC Early-Return Creation which is fixed here.
> This is a major rewrite of the SelectionDAG switch lowering. The previous code
> would lower switches as a binary tre, discovering clusters of cases
> suitable for lowering by jump tables or bit tests as it went along. To increase
> the likelihood of finding jump tables, the binary tree pivot was selected to
> maximize case density on both sides of the pivot.
>
> By not selecting the pivot in the middle, the binary trees would not always
> be balanced, leading to performance problems in the generated code.
>
> This patch rewrites the lowering to search for clusters of cases
> suitable for jump tables or bit tests first, and then builds the binary
> tree around those clusters. This way, the binary tree will always be balanced.
>
> This has the added benefit of decoupling the different aspects of the lowering:
> tree building and jump table or bit tests finding are now easier to tweak
> separately.
>
> For example, this will enable us to balance the tree based on profile info
> in the future.
>
> The algorithm for finding jump tables is quadratic, whereas the previous algorithm
> was O(n log n) for common cases, and quadratic only in the worst-case. This
> doesn't seem to be major problem in practice, e.g. compiling a file consisting
> of a 10k-case switch was only 30% slower, and such large switches should be rare
> in practice. Compiling e.g. gcc.c showed no compile-time difference. If this
> does turn out to be a problem, we could limit the search space of the algorithm.
>
> This commit also disables all optimizations during switch lowering in -O0.
>
> Differential Revision: http://reviews.llvm.org/D8649
llvm-svn: 235560
Only clear out the NSW/NUW flags if we are optimizing 'add'/'sub' while
taking advantage that the sign bit is not set. We do this optimization
to further shrink the mask but shrinking the mask isn't NSW/NUW
preserving in this case.
llvm-svn: 235558
This removes the -sehprepare flag and makes __C_specific_handler
functions always to use WinEHPrepare.
This was tested by building all of chromium_builder_tests and running a
few tests that use SEH, but if something breaks, we can revert this.
llvm-svn: 235557
In particular, this handles SSA values that are live *out* of a handler.
The existing code only handles values that are live *in* to a handler.
It also handles phi nodes in the block where normal control should
resume after the end of a catch handler. When EH return points have phi
nodes, we need to split the return edge. It is impossible for phi
elimination to emit copies in the previous block if that block gets
outlined. The indirectbr that we leave in the function is only notional,
and is eliminated from the MachineFunction CFG early on.
Reviewers: majnemer, andrew.w.kaylor
Differential Revision: http://reviews.llvm.org/D9158
llvm-svn: 235545
An nsw/nuw operation relies on the values feeding into it to not
overflow if 'poison' is not to be produced. This means that
optimizations which make modifications to the bottom of a chain (like
SimplifyDemandedBits) must strip out nsw/nuw if they cannot ensure that
they will be preserved.
This fixes PR23309.
llvm-svn: 235544
I added the poisoning back in r76891 (2009) because of some bugs in
Unladen Swallow, and then Evan Cheng added the setRangeWritable() call
in r81308. Profiling a Release+Asserts build on Windows shows that this
memory protection call is actually very expensive. 4 seconds of a 70
second Clang compilation are spent in VirtualQuery. These days we have
more reliable tools like ASan to find these kinds of bugs, so we can go
ahead and retire these checks.
llvm-svn: 235542
This reverts commit r235458.
It looks like this might be breaking something LTO-ish. Looking into it
& will recommit with a fix/test case/etc once I've got more to go on.
llvm-svn: 235533
The CondOpt pass currently uses LiveIntervals to set the dead flag on a def. This patch uses MachineRegisterInfo::use_empty instead as that is equivalent to the def being dead.
This removes an instance of LiveIntervals in the pass manager pipeline and saves 3.8% of compile time on llc conpiled for AArch64.
Reviewed by Chad Rosier and Zhaoshi.
llvm-svn: 235532
Summary:
Remove the CHECK-DAG calls introduced in r235341, and add a comment that
this test may break due to scheduling variations.
This patch completes the fix discussed in http://reviews.llvm.org/D8804
Reviewers: dsanders, srhines
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9178
llvm-svn: 235530
This causes OpKind and all the bitfields after it to use 32-bit load/stores instead of i24's for the existing bitfields.
Bugs will be filed to track whether clang and llvm could have generated the 32-bit operations in the front-end or optimizer.
Reviewed by Rafael.
llvm-svn: 235528
ARM32 ELF R_ARM_V4BX relocation format is a special relocation type
that records the location of an ARMv4t BX instruction to enable a
static linker to generate ARMv4 compatible instructions. This
relocation does not contain a reference symbol.
This patch enabled its creation by removing the requeriment of a
relocation symbol target in ELFState<ELFT>::writeSectionContent.
llvm-svn: 235513
An assert was triggered when attempting to create a new SCEV
with operands of different types in the visitAddRecExpr. In this
test case, the operand types of the numerator and denominator
are different. The SCEV division code should generate a
conservative answer when this happens.
Differential Revision: http://reviews.llvm.org/D9021
llvm-svn: 235511
This fixes a regression introduced at revision 218263.
On AVX, if we optimize for size, a splat build_vector of a load
is lowered into a VBROADCAST node. This is done even if the value type of the
splat build_vector node is v2i64.
Since AVX doesn't support v2f64/v2i64 broadcasts, revision 218263 added two
extra tablegen patterns to allow selecting a VMOVDDUPrm from an X86VBroadcast
where the scalar element comes from a loadi64/loadf64.
However, revision 218263 forgot to add an extra fallback pattern for the case
where we have a X86VBroadcast of a loadi64 with multiple uses.
This patch adds the missing tablegen pattern in X86InstrSSE.td.
This patch also adds an extra test to 'splat-for-size.ll' to verify that ISel
doesn't crash with a 'fatal error in the backend' due to a missing AVX pattern
to select v2i64 X86ISD::BROADCAST nodes.
llvm-svn: 235509
This turned up after r235333, but was a pre-existing bug. The optimization
which transforms select(c, load, load) into a load of a select of the addresses
does not handle indexed loads (pre/post inc/dec). However, it did not check for
them either, leading to a crash if it tried to transform one of them.
llvm-svn: 235497
Enough concerns were raised that this optimization is pessimising some code patterns.
The obvious fix, to add a Reassociate run afterwards, causes even more pessimisation in some cases due to fewer complex addressing modes being matched. As there isn't a trivial fix for this, backing this out by default until someone gets a chance to fix the addressing mode matcher.
llvm-svn: 235491
X86 backend.
The code generated for symbolic targets is identical to the code generated for
constant targets, except that a relocation is emitted to fix up the actual
target address at link-time. This allows IR and object files containing
patchpoints to be cached across JIT-invocations where the target address may
change.
llvm-svn: 235483
Previously the code was accidentally checking if 'this' was an IntRecTy which it can't be since 'this' is a BitRecTy. Looking back at the history it appears it was intended to check RHS.
llvm-svn: 235477
Without pointee types the space optimization of storing only the pointer
type and not the value type won't be viable - so add the extra type
information that would be missing.
llvm-svn: 235475
Without pointee types the space optimization of storing only the pointer
type and not the value type won't be viable - so add the extra type
information that would be missing.
Storeatomic coming soon.
llvm-svn: 235474
Add a flag to lib/Linker (and `llvm-link`) to override linkage rules.
When set, the functions in the source module *always* replace those in
the destination module.
The `llvm-link` option is `-override=abc.ll`. All the "regular" modules
are loaded and linked first, followed by the `-override` modules. This
is useful for debugging workflows where some subset of the module (e.g.,
a single function) is extracted into a separate file where it's
optimized differently, before being merged back in.
Patch by Luqman Aden!
llvm-svn: 235473
Factor the loop for linking input files together into a combined module
into a separate function. This is in preparation for an upcoming patch
that runs the logic twice.
Patch by Luqman Aden!
llvm-svn: 235472
Turns out I misread the parentheses. Though I'm pretty sure its always a RecordRecTy and non of the callers really seem to expect null. But until I'm completely sure I'm going to revert this.
llvm-svn: 235469
With SSE2, we can generate a 'movq' or other 64-bit store op on a 32-bit system
even though 64-bit integers are not legal types.
So instead of producing this:
pshufd $229, %xmm0, %xmm1 ## xmm1 = xmm0[1,1,2,3]
movd %xmm0, (%eax)
movd %xmm1, 4(%eax)
We can do:
movq %xmm0, (%eax)
This is a fix for the problem noted in D7296.
Differential Revision: http://reviews.llvm.org/D9134
llvm-svn: 235460
We should also teach the inliner to collapse framerecover of
frameaddress of the current frame down to an alloca, but that can happen
later.
llvm-svn: 235459
Calls to llvm::Value::mutateType are becoming extra-sensitive now that
instructions have extra type information that will not be derived from
operands or result type (alloca, gep, load, call/invoke, etc... ). The
special-handling for mutateType will get more complicated as this work
continues - it might be worth making mutateType virtual & pushing the
complexity down into the classes that need special handling. But with
only two significant uses of mutateType (vectorization and linking) this
seems OK for now.
Totally open to ideas/suggestions/improvements, of course.
With this, and a bunch of exceptions, we can roundtrip an indirect call
site through bitcode and IR. (a direct call site is actually trickier...
I haven't figured out how to deal with the IR deserializer's lazy
construction of Function/GlobalVariable decl's based on the type of the
entity which means looking through the "pointer to T" type referring to
the global)
llvm-svn: 235458
https://llvm.org/bugs/show_bug.cgi?id=23163.
Gep merging sometimes behaves like a reverse CSE/LICM optimization,
which has negative impact on performance. In this patch we restrict
gep merging to happen only when the indexes to be merged are both consts,
which ensures such merge is always beneficial.
The patch makes gep merging only happen in very restrictive cases.
It is possible that some analysis/optimization passes rely on the merged
geps to get better result, and we havn't notice them yet. We will be ready
to further improve it once we see the cases.
Differential Revision: http://reviews.llvm.org/D8911
llvm-svn: 235455
https://llvm.org/bugs/show_bug.cgi?id=23163.
Gep merging sometimes behaves like a reverse CSE/LICM optimizations,
which has negative impact on performance. In this patch we restrict
gep merging to happen only when the indexes to be merged are both consts,
which ensures such merge is always beneficial.
The patch makes gep merging only happen in very restrictive cases.
It is possible that some analysis/optimization passes rely on the merged
geps to get better result, and we havn't notice them yet. We will be ready
to further improve it once we see the cases.
Differential Revision: http://reviews.llvm.org/D9007
llvm-svn: 235451
MemIntrinsic::getDest() looks through pointer casts, and using it
directly when building the new GEP+memset results in stuff like:
%0 = getelementptr i64* %p, i32 16
%1 = bitcast i64* %0 to i8*
call ..memset(i8* %1, ...)
instead of the correct:
%0 = bitcast i64* %p to i8*
%1 = getelementptr i8* %0, i32 16
call ..memset(i8* %1, ...)
Instead, use getRawDest, which just gives you the i8* value.
While there, use the memcpy's dest, as it's live anyway.
In most cases, when the optimization triggers, the memset and memcpy
sizes are the same, so the built memset is 0-sized and eliminated.
The problem occurs when they're different.
Fixes a regression caused by r235232: PR23300.
llvm-svn: 235419
Summary:
This lets us use range based for loops.
Reviewers: chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9169
llvm-svn: 235416
Summary:
With D9096 and D9101, there's no need to run DCE after SLSR and
SeparateConstOffsetFromGEP.
Test Plan: no regression
Reviewers: jholewinski, meheff
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D9172
llvm-svn: 235415
Remove the `DIArray` and `DITypeArray` typedefs, preferring the
underlying types (`DebugNodeArray` and `MDTypeRefArray`, respectively).
llvm-svn: 235413
Summary:
After we rewrite a candidate, the instructions used by the old form may
become unused. This patch cleans up these unused instructions so that we
needn't run DCE after SLSR.
Test Plan: removed -dce in all the SLSR tests
Reviewers: broune, meheff
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9101
llvm-svn: 235410
Summary: so that we needn't run DCE after this pass.
Test Plan: removed -dce from the commandline in split-gep.ll and split-gep-and-gvn.ll
Reviewers: meheff
Subscribers: llvm-commits, HaoLiu, hfinkel, jholewinski
Differential Revision: http://reviews.llvm.org/D9096
llvm-svn: 235409
Summary:
MemorySSA uses this algorithm as well, and this enables us to reuse the code in both places.
There are no actual algorithm or datastructure changes in here, just code movement.
Reviewers: qcolombet, chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9118
llvm-svn: 235406
Remove early returns for when `getVariable()` is null, and just assert
that it never happens. The Verifier already confirms that there's a
valid variable on these intrinsics, so we should assume the debug info
isn't broken. I also updated a check for a `!dbg` attachment, which the
Verifier similarly guarantees.
llvm-svn: 235400
Keep the old SEH fan-in lowering on by default for now, since projects
rely on it. This will make it easy to test this change with a simple
flag flip.
llvm-svn: 235399
There doesn't seem to be a reason to perform this target ISD node matching
in an DAGCombine, moving it to lowering fixes PR23296.
Differential Revision: http://reviews.llvm.org/D9137
llvm-svn: 235394
Summary:
This directive is exactly the same as .asciz, except it's only used by MIPS.
It is used to store null terminated strings in object files.
Reviewers: rafael, dsanders, echristo
Reviewed By: dsanders, echristo
Subscribers: echristo, llvm-commits
Differential Revision: http://reviews.llvm.org/D7530
llvm-svn: 235382
Summary:
The 64-bit version of the variable shift instructions uses the
shift_rotate_reg class which uses a GPR32Opnd to specify the variable
shift amount. With this patch we avoid the generation of a redundant
SLL instruction for the variable shift instructions in 64-bit targets.
Reviewers: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D7413
llvm-svn: 235376
This is an updated version of Chandler's patch D7402 that got accepted but never committed, and has bit-rotted a bit since.
I've updated the execution domain declarations to match the approach of the packed templates and also added some extra scalar unary tests.
Differential Revision: http://reviews.llvm.org/D9095
llvm-svn: 235372
Fixed issue with the combine of CONCAT_VECTOR of 2 BUILD_VECTOR nodes - the optimisation wasn't ensuring that the scalar operands of both nodes were the same type/size for implicit truncation.
Test case spotted by Patrik Hagglund
llvm-svn: 235371
Summary:
This fixes http://llvm.org/bugs/show_bug.cgi?id=16439.
This is one possible way to approach this. The other would be to split InL>>(nbits-Amt) into (InL>>(nbits-1-Amt))>>1, which is also valid since since we only need to care about Amt up nbits-1. It's hard to tell which one is better since the shift might be expensive if this stage of expansion is not yet a legal machine integer, whereas comparisons with zero are relatively cheap at all sizes, but more expensive than a shift if the shift is on a legal machine type.
Patch by Keno Fischer!
Test Plan: regression test from http://reviews.llvm.org/D7752
Reviewers: chfast, resistor
Reviewed By: chfast, resistor
Subscribers: sanjoy, resistor, chfast, llvm-commits
Differential Revision: http://reviews.llvm.org/D4978
llvm-svn: 235370
This brings the utils/vim folder into a more vim-like format by moving
the syntax hightlighting files into a syntax subdirectory. It adds
some minimal settings that everyone should agree on to ftdetect/ftplugin and
features a new indentation plugin for .ll files.
llvm-svn: 235369
X86ISD::ADDSUB, X86ISD::(F)HADD, X86ISD::(F)HSUB should not be selected
if the operand types do not match the result type because vector type
legalization cannot deal with this for custom nodes.
Testcase X86ISD::ADDSUB is attached. I could not create a testcase for
the FHADD/FHSUB cases because of: https://llvm.org/bugs/show_bug.cgi?id=23296
Differential Revision: http://reviews.llvm.org/D9120
llvm-svn: 235367
Summary:
Bundle aligment requires that the functions always start at an aligned address.
Usually this is ensured by the compiler, but assembly code does not always
begin with a .align directive.
This change ensures that sections get the correct alignment if they contain
any instructions and bundling is enabled. (It also makes LLVM match the
behavior of GNU as).
Differential Revision: http://reviews.llvm.org/D9131
llvm-svn: 235365
Summary:
In the f16-promote test, make the checks for native conversion instructions
similar to the libcall checks:
- Remove hard coded register names
- Do not check exact instruction sequences.
This fixes test flakiness due to non-determinism in instruction
scheduling and register allocation. I also fixed a few minor things in
the CHECK-LIBCALL checks.
I'll try to find a way to check that unnecessary loads, stores, or
conversions don't happen.
Reviewers: mzolotukhin, srhines, ab
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9112
llvm-svn: 235363
Summary:
This patch adds two flags to `bugpoint`: "-replace-funcs-with-null" and "-disable-pass-list-reduction".
When "-replace-funcs-with-null" is specified, bugpoint will, instead of simply deleting function bodies, replace all uses of functions and then will delete functions completely from the test module, correctly handling aliasing and @llvm.used && @llvm.compiler.used. This part was conceived while trying to debug the PNaCl IR simplification passes, which don't allow undefined functions (ie no declarations).
With "-disable-pass-list-reduction", bugpoint won't try to reduce the set of passes causing the "crash". This is needed in cases where one is trying to debug an issue inside the PNaCl IR simplification passes which is causing an PNaCl ABI verification error, for example.
Reviewers: jfb
Reviewed By: jfb
Subscribers: jfb, llvm-commits
Differential Revision: http://reviews.llvm.org/D8555
llvm-svn: 235362
Delete subclasses of (the already deleted) `DIType` in favour of
directly using pointers from the `Metadata` hierarchy.
While `DICompositeType` wraps `MDCompositeTypeBase` and `DIDerivedType`
wraps `MDDerivedTypeBase`, most uses of each really meant the more
specific `MDCompositeType` and `MDDerivedType`.
llvm-svn: 235351
The version of `constructTypeDIE()` for `MDSubroutineType` is unrelated
to (and has different callers than) the `MDCompositeType`. Split the
two in half.
This simplifies an upcoming patch to delete `DICompositeType`. There
shouldn't be any real functionality change here. `createTypeDIE()` is
`cast<>`'ing where it didn't need to before, but that function in turn
is only called for true `MDCompositeType`s.
llvm-svn: 235349
the function body.
This is necessary for correctness when lazily compiling.
Also, flesh out the Orc unit test infrastructure slightly, and add a unit test
for this.
llvm-svn: 235347
Update comment style in `DwarfUnit`.
- Drop duplicated comments at definition, and update the comments at
the declaration where the definition comments looked newer or more
complete.
- Drop the `functionName -` prefix.
- Add `\brief` in a few places.
- Remove a few comments entirely that weren't adding value (just
turned the function name and arguments into a sentence).
llvm-svn: 235345
Summary:
Set operation action for FP16 conversion opcodes, so the Op legalizer
can choose the gnu_* libcalls for Mips.
Set LoadExtAction and TruncStoreAction for f16 scalars and vectors to
prevent (fpext (load )) and (store (fptrunc)) from getting combined into
unsupported operations.
Added test cases to test that these operations are handled correctly
for f16 scalars and vectors. This patch depends on
http://reviews.llvm.org/D8755.
Reviewers: srhines
Subscribers: llvm-commits, ab
Differential Revision: http://reviews.llvm.org/D8804
llvm-svn: 235341
This is the last major parent class, so I'll probably start deleting
classes in batches now. Looks like many of the references to the DI*
hierarchy were updated organically along the way.
llvm-svn: 235331
Replace uses of `DIScope` with `MDScope*`. There was one spot where
I've left an `MDScope*` uninitialized (where `DIScope` would have been
default-initialized to `nullptr`) -- this is intentional, since the
if/else that follows should unconditional assign it to a value.
llvm-svn: 235327
This adds the following targets to cmake. These can be used to build and link only specific parts of a backend, instead of having to link the whole backend.
- AllTargetsAsmPrinters, AllTargetsAsmParsers, AllTargetsDescs, AllTargetsDisassemblers, AllTargetsInfos
A typical use for these is instead of linking ${LLVM_TARGETS_TO_BUILD}. This commit changes llvm-mc to show how to use the new targets.
Reviewed by Chris Bieneman.
llvm-svn: 235324
This commit fixes the code which adds lifetime markers in InlineFunction to skip
zero-sized allocas instead of asserting on them.
rdar://problem/20531155
llvm-svn: 235312
n/1 generates a quotient equal to n and a remainder of 0.
If this case is not recognized, then the SCEV divide() function
can return a remainder that is greater than or equal to the
denominator, which means the delinearized subscripts for the
test case will be incorrect.
Differential Revision: http://reviews.llvm.org/D9003
llvm-svn: 235311