For Swift we would like to be able to encode the error types that a
function may throw, so the debugger can display them alongside the
function's return value when finish-ing a function.
DWARF defines DW_TAG_thrown_type (intended to be used for C++ throw()
declarations) that is a perfect fit for this purpose. This patch wires
up support for DW_TAG_thrown_type in LLVM by adding a list of thrown
types to DISubprogram.
To offset the cost of the extra pointer, there is a follow-up patch
that turns DISubprogram into a variable-length node.
rdar://problem/29481673
Differential Revision: https://reviews.llvm.org/D32559
llvm-svn: 301489
The previous algorithm processed one character at a time, which is very
painful on a modern CPU. Replace it with xxHash64, which both already
exists in the codebase and is fairly fast.
Patch from Scott Smith!
Differential Revision: https://reviews.llvm.org/D32509
llvm-svn: 301487
Besides better codegen, the motivation is to be able to canonicalize this pattern
in IR (currently we don't) knowing that the backend is prepared for that.
This may also allow removing code for special constant cases in
DAGCombiner::foldSelectOfConstants() that was added in D30180.
Differential Revision: https://reviews.llvm.org/D31944
llvm-svn: 301457
Summary of changes:
- corrected vmcnt, expcnt, lgkmcnt helpers to checks their argument for truncation;
- added saturated versions of these helpers.
See bug 32711 for details: https://bugs.llvm.org//show_bug.cgi?id=32711
Reviewers: artem.tamazov, vpykhtin
Differential Revision: https://reviews.llvm.org/D32546
llvm-svn: 301439
Marking them as used causes them to be considered visible outside of LTO. This
prevents the symbols from being internalized or discarded, either by GlobalDCE
or by summary-based dead stripping in ThinLTO.
This change makes it unnecessary to add these symbols to llvm.compiler.used
in the backend, as the symbols are kept alive by virtue of being external,
so remove the backend code that handles that.
Fixes PR32798.
Differential Revision: https://reviews.llvm.org/D32544
llvm-svn: 301438
This patch introduces a new KnownBits struct that wraps the two APInt used by computeKnownBits. This allows us to treat them as more of a unit.
Initially I've just altered the signatures of computeKnownBits and InstCombine's simplifyDemandedBits to pass a KnownBits reference instead of two separate APInt references. I'll do similar to the SelectionDAG version of computeKnownBits/simplifyDemandedBits as a separate patch.
I've added a constructor that allows initializing both APInts to the same bit width with a starting value of 0. This reduces the repeated pattern of initializing both APInts. Once place default constructed the APInts so I added a default constructor for those cases.
Going forward I would like to add more methods that will work on the pairs. For example trunc, zext, and sext occur on both APInts together in several places. We should probably add a clear method that can be used to clear both pieces. Maybe a method to check for conflicting information. A method to return (Zero|One) so we don't write it out everywhere. Maybe a method for (Zero|One).isAllOnesValue() to determine if all bits are known. I'm sure there are many other methods we can come up with.
Differential Revision: https://reviews.llvm.org/D32376
llvm-svn: 301432
Commits were:
"Use WeakVH instead of WeakTrackingVH in AliasSetTracker's UnkownInsts"
"Add a new WeakVH value handle; NFC"
"Rename WeakVH to WeakTrackingVH; NFC"
The changes assumed pointers are 8 byte aligned on all architectures.
llvm-svn: 301429
Summary:
In cases where an instruction (a call site, say) is RAUW'ed with some
other value (this is possible via the `returned` attribute, amongst
other things), we want the slot in UnknownInsts to point to the
original Instruction we wanted to track, not the value it got replaced
by.
Fixes PR32587.
Reviewers: davide
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D32268
llvm-svn: 301426
Summary:
WeakVH nulls itself out if the value it was tracking gets deleted, but
it does not track RAUW.
Reviewers: dblaikie, davide
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D32267
llvm-svn: 301425
Summary:
I plan to use WeakVH to mean "nulls itself out on deletion, but does
not track RAUW" in a subsequent commit.
Reviewers: dblaikie, davide
Reviewed By: davide
Subscribers: arsenm, mehdi_amini, mcrosier, mzolotukhin, jfb, llvm-commits, nhaehnle
Differential Revision: https://reviews.llvm.org/D32266
llvm-svn: 301424
The SampleProfWriter emits function information in an order determined
by the string hash function. The situation is a bit brittle, because
changing the hash function can break the tests.
Instead of sorting the function samples to get a relaible ordering (that
might be too expensive), make the tests not depend on a particular
ordering of function samples.
Differential Revision: https://reviews.llvm.org/D32516
llvm-svn: 301419
Build vectors have magical truncation powers, so we have things like this:
v4i1 = BUILD_VECTOR Constant:i32<1>, Constant:i32<1>, Constant:i32<1>, Constant:i32<1>
v4i16 = BUILD_VECTOR Constant:i32<1>, Constant:i32<1>, Constant:i32<1>, Constant:i32<1>
If we don't truncate the splat node returned by getConstantSplatNode(), then we won't find
truth when ZeroOrNegativeOneBooleanContent is the rule.
Differential Revision: https://reviews.llvm.org/D32505
llvm-svn: 301408
For targets that don't have ISD::MULHS or ISD::SMUL_LOHI for the type
and the double width type is illegal, then the two operands are
sign extended to twice their size then multiplied to check for overflow.
The extended upper halves were mismatched causing an incorrect result.
This fixes the mismatch.
A test was added for ARM V6-M where the bug was detected.
Patch by James Duley.
Differential Revision: https://reviews.llvm.org/D31807
llvm-svn: 301404
Summary:
Otherwise we might end up with some empty basic blocks or
single-entry-single-exit basic blocks.
This fixes PR32085
Reviewers: chandlerc, danielcdh
Subscribers: mehdi_amini, RKSimon, llvm-commits
Differential Revision: https://reviews.llvm.org/D30468
llvm-svn: 301395
Removed micro mips register classes for gp initialization because gp initialization uses pure mips64 instruction. Even when compiling for micro mips, gp initialization can be done with pure mips64 instructions.
Reviewed by Simon Dardis
Differential: D32286
llvm-svn: 301394
r299766 contained a "conditional move or jump depends on uninitialized value"
fault, identified by valgrind. This occurred as MipsFastISel::finishCall(..)
used CCState over MipsCCState. The latter is required for the TableGen'd calling
convention logic due to reliance on pre-analyzing type information to lower call
results/returns of vectors correctly.
This change modifies the MipsCC AnalyzeCallResult to be useful with both the
SelectionDAG and FastISel lowering logic.
Reviewers: slthakur
Differential Revision: https://reviews.llvm.org/D32004
llvm-svn: 301392
Summary:
Expose the internal query structure, start using it.
Note: This is the most minimal change possible i could create. I have
trivial followups, like fixing the one use of const FastMathFlags &,
the renaming of CtxI to be consistent, etc.
This should be NFC.
Reviewers: majnemer, davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32448
llvm-svn: 301379
If Select pseudo instruction doesn't have use SR, then
CMP instructions are being marked as dead and later can be
removed by MachineCSE pass. This leads to incorrect code
generation.
Differential Revision: https://reviews.llvm.org/D32473
llvm-svn: 301372
The order in which GCOV file info is printed depends on the string hash
function. This makes some GCOV tests brittle, because the tests must be
updated whenever the hash function changes.
Sort the filenames before printing out the file info to solve the
problem. This should be relatively cheap.
Differential Revision: https://reviews.llvm.org/D32512
llvm-svn: 301371
Summary:
Addends are used as offsets to addresses of globals
and can be both positive and negative. This change
prints libObject in line with the spec and the MC
layer.
Subscribers: jfb, dschuff
Differential Revision: https://reviews.llvm.org/D32507
llvm-svn: 301369
We were already parsing and dumping this to the human readable
format, but not to the YAML format. This does so, in preparation
for reading it in and reconstructing the line information from
YAML.
llvm-svn: 301357
We already have a function toHex that will convert a string like
"\xFF\xFF" to the string "FFFF", but we do not have one that goes
the other way - i.e. to convert a textual string representing a
sequence of hexadecimal characters into the corresponding actual
bytes. This patch adds such a function.
llvm-svn: 301356
This may trigger a segfault in llvm-objdump when the line number stored
in debug infromation points beyond the end of file; lines in LineBuffer
are stored in std::vector which is allocated in chunks, so even if the
debug info points beyond the end of the file, this doesn't necessarily
trigger the segfault unless the line number points beyond the allocated
space.
Differential Revision: https://reviews.llvm.org/D32466
llvm-svn: 301347
This patch is part of D28975's breakdown.
induction.ll encodes the specific (and rather arbitrary) numbers given to
predicated basic blocks by the unique naming mechanism, which makes it
sensitive to changes in LV's instruction generation order. This patch replaces
those specific numbers with a numeric pattern.
Differential Revision: https://reviews.llvm.org/D32404
llvm-svn: 301345
The code I've removed here exists in ExpandBinOp in InstSimplify which we call into before SimplifyUsingDistributiveLaws. The code in InstSimplify looks to have been copied from here.
I verified this code doesn't fire on any lit tests. Not that that proves its definitely dead.
Differential Revision: https://reviews.llvm.org/D32472
llvm-svn: 301341
This patch uses various APInt methods to reduce temporary APInt creation.
This should be all of the unrelated cleanups that got buried in D32376(creating a KnownBits struct) as well as some pointed out by Simon during the review of that. Plus a few improvements to use counting instead of masking.
I've left out any places where we do something like (KnownZero & KnownOne) != 0 as I plan to add a helper method to KnownBits to ask that question and didn't want to thrash that code an additional time.
Differential Revision: https://reviews.llvm.org/D32495
llvm-svn: 301338
The code Sanjay Patel moved over from InstCombine doesn't work properly if the 'and' has both inputs as nots because we used a commuted op matcher on the 'and' first. But this will bind to the first 'not' on 'and' when there could be two 'not's. InstCombine could rely on DeMorgan to ensure the 'and' wouldn't have two 'not's eventually, but InstSimplify can't rely on that.
This patch matches the xor first then checks for the ands and allows a not of either operand of the xor.
Differential Revision: https://reviews.llvm.org/D32458
llvm-svn: 301329
This is a pre-commit for a patch I'm working on to turn KnownZero/One into a struct. Once I do that the type here will be less obvious.
llvm-svn: 301324
This is a pre-commit for a patch that I'm working on to merge KnownZero/KnownOne into a KnownBits struct which would have had to touch this line.
llvm-svn: 301323
This patch reapplies r301309 with the fix to the MIR test to fix the assertion
triggered by r301309. Had trimmed a little bit too much from the MIR!
llvm-svn: 301317
The matching here wasn't able to handle all the possible commutes. It always assumed the not would be on the left of the xor, but that's not guaranteed.
Differential Revision: https://reviews.llvm.org/D32474
llvm-svn: 301316
Summary: No test case since I'm not aware of an in-tree target that needs this.
Reviewers: hans
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32398
llvm-svn: 301311
This patch fixes a bug with the updating of DBG_VALUE's in
BreakAntiDependencies. Previously, it would only attempt to update the first
DBG_VALUE following the instruction whose register is being changed,
potentially leaving DBG_VALUE's referring to the wrong register. Now the code
will update all DBG_VALUE's that immediately follow the instruction.
This issue was detected as a result of an optimized codegen difference with
"-g" where an X86 byte/word fixup was not performed due to a DBG_VALUE
referencing the wrong register.
Differential Revision: https://reviews.llvm.org/D31755
llvm-svn: 301309
One of the fast-math optimizations is to replace calls to standard double
functions with their float equivalents, e.g. exp -> expf. However, this can
cause infinite loops for the following:
float expf(float val) { return (float) exp((double) val); }
A similar inline declaration exists in the MinGW-w64 math.h header file which
when compiled with -O2/3 and fast-math generates infinite loops.
So this fix checks that the calling function to the standard double function
that is being replaced does not match the float equivalent.
Differential Revision: https://reviews.llvm.org/D31806
llvm-svn: 301304
Summary:
In a previous change I changed SCEV's normalization / denormalization
to work with non-affine add recs. So the bailout in IVUsers can be
removed.
Reviewers: atrick, efriedma
Reviewed By: atrick
Subscribers: davide, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D32105
llvm-svn: 301298
This patch is part of D28975's breakdown.
Genreating the control-flow to guard predicated instructions modified to
only use SplitBlockAndInsertIfThen() for producing the if-then construct.
Differential Revision: https://reviews.llvm.org/D32224
llvm-svn: 301293
We need to do this to prevent a miscompile which sinks an objc_retain
past an objc_release that releases the object objc_retain retains. This
happens because the top-down and bottom-up traversals each determines
the insert point for retain or release individually without knowing
where the other instruction is moved.
For example, when the following IR is fed to the ARC optimizer, the
top-down traversal decides to insert objc_retain right before
objc_release and the bottom-up traversal decides to insert objc_release
right after clang.arc.use.
(IR before ARC optimizer)
%11 = call i8* @objc_retain(i8* %10)
call void (...) @clang.arc.use(%0* %5)
call void @llvm.dbg.value(...)
call void @objc_release(i8* %6)
This reverses the order of objc_release and objc_retain, which causes
the object to be destructed prematurely.
(IR after ARC optimizer)
call void (...) @clang.arc.use(%0* %5)
call void @objc_release(i8* %6)
call void @llvm.dbg.value(...)
%11 = call i8* @objc_retain(i8* %10)
rdar://problem/30530580
llvm-svn: 301289
Summary:
Before this change, SCEV Normalization would incorrectly normalize
non-affine add recurrences. To work around this there was (still is)
a check in place to make sure we only tried to normalize affine add
recurrences.
We recently found a bug in aforementioned check to bail out of
normalizing non-affine add recurrences. However, instead of fixing
the bailout, I have decided to teach SCEV normalization to work
correctly with non-affine add recurrences, making the bailout
unnecessary (I'll remove it in a subsequent change).
I've also added some unit tests (which would have failed before this
change).
Reviewers: atrick, sunfish, efriedma
Reviewed By: atrick
Subscribers: mcrosier, mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D32104
llvm-svn: 301281
I'm proposing a fold for increment-of-sexted-bool in:
https://reviews.llvm.org/D31944
...so we need to know what happens in more cases like these.
llvm-svn: 301269
Remove the temporary, poorly named getSlotSet method which did the same
thing. Also remove getSlotNode, which is a hold-over from when we were
dealing with AttributeSetNode* instead of AttributeSet.
llvm-svn: 301267
Summary:
`git apply` on Windows doesn't work for files that SVN checks out as
CRLF. There is no way to force SVN to check everything out with Unix
line endings on Windows. Files with svn:eol-style=native will always
come out with CRLF, breaking `git apply`, which wants Unix line endings.
My workaround is to list all files with this property set in the change,
and run `dos2unix` on them. SVN doesn't commit a massive line ending
change because the svn:eol-style property indicates that these are text
files.
Tested on r301245.
Reviewers: zturner, jlebar
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32452
llvm-svn: 301262
We can simplify (and (icmp X, C1), (icmp X, C2)) to one of the icmps in many cases.
I had to check some of these with Alive to prove to myself it's right, but everything
seems to check out. Eg, the code in instcombine was completely ignoring predicates with
mismatched signedness.
Handling or-of-icmps would be a follow-up step.
Differential Revision: https://reviews.llvm.org/D32143
llvm-svn: 301260
Rely on MachineRegisterInfo's knowledge of used physical
registers.
Move flat_scratch initialization earlier, so the uses are visible
when making these decisions.
This will make it easier to add another reserved register
at the end for the stack pointer rather than handling another
special case.
llvm-svn: 301254
Summary:
That API creates a temporary AttributeList to carry an index and a
single AttributeSet. We need to carry the index in addition to the set,
because that is how attribute groups are currently encoded.
NFC
Reviewers: pcc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32262
llvm-svn: 301245
Summary:
Ensure that the new merge BB (which contains the rest of the original BB
after the mem op being optimized) gets a profile frequency, in case
there are additional mem ops later in the BB. Otherwise they get skipped
as the merge BB looks cold.
Reviewers: davidxl, xur
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32447
llvm-svn: 301244
This reverts commit r300732. This breaks a few tests.
I think the problem is related to adding more uses of
the condition that don't yet exist at this point.
llvm-svn: 301242
The current Loop Unroll implementation works with loops having a
single latch that contains a conditional branch to a block outside
the loop (the other successor is, by defition of latch, the header).
If this precondition doesn't hold, avoid unrolling the loop as
the code is not ready to handle such circumstances.
Differential Revision: https://reviews.llvm.org/D32261
llvm-svn: 301239
Use constant references rather than `const auto` which will cause the
copy constructor. These particular cases cause issues for the swift
compiler.
llvm-svn: 301237
libraries are properly unloaded when llvm_shutdown is called.
Summary:
This was mostly affecting usage of the JIT, where storing the library handles in
a set made iteration unordered/undefined. This lead to disagreement between the
JIT and native code as to what the address and implementation of particularly on
Windows with stdlib functions:
JIT: putenv_s("TEST", "VALUE") // called msvcrt.dll, putenv_s
JIT: getenv("TEST") -> "VALUE" // called msvcrt.dll, getenv
Native: getenv("TEST") -> NULL // called ucrt.dll, getenv
Also fixed is the issue of DynamicLibrary::getPermanentLibrary(0,0) on Windows
not giving priority to the process' symbols as it did on Unix.
Reviewers: chapuni, v.g.vassilev, lhames
Reviewed By: lhames
Subscribers: danalbert, srhines, mgorny, vsk, llvm-commits
Differential Revision: https://reviews.llvm.org/D30107
llvm-svn: 301236
In call sequence setups, there may not be a frame index base
and the pointer is a constant offset from the frame
pointer / scratch wave offset register.
llvm-svn: 301230
Merges equivalent initializations of M0 and hoists them into a common
dominator block. Technically the same code can be used with any
register, physical or virtual.
Differential Revision: https://reviews.llvm.org/D32279
llvm-svn: 301228
Summary:
llvm.invariant.group.barrier returns pointer that mustalias
pointer it takes. It can't be marked with `returned` attribute,
because it would be remove easily. The other reason is that
only Alias Analysis can know about this, because if any other
pass would know it, then the result would be replaced with it's
argument, which would be invalid.
We can think about returned pointer as something that mustalias, but
it doesn't have to be bitwise the same as the argument.
Reviewers: dberlin, chandlerc, hfinkel, sanjoy
Subscribers: reames, nlewycky, rsmith, anna, amharc
Differential Revision: https://reviews.llvm.org/D31585
llvm-svn: 301227
1. RegisterClass::getSize() is split into two functions:
- TargetRegisterInfo::getRegSizeInBits(const TargetRegisterClass &RC) const;
- TargetRegisterInfo::getSpillSize(const TargetRegisterClass &RC) const;
2. RegisterClass::getAlignment() is replaced by:
- TargetRegisterInfo::getSpillAlignment(const TargetRegisterClass &RC) const;
This will allow making those values depend on subtarget features in the
future.
Differential Revision: https://reviews.llvm.org/D31783
llvm-svn: 301221
Summary: In rL297945, jhenderson added methods for setting permissions
to sys::fs, but some of the unittests that attempt to set sticky bits
(01000) on files fail on modern BSDs, such as FreeBSD, NetBSD and
OpenBSD. This is because those systems do not allow regular users to
set sticky bits on files, only on directories. Fix it by disabling
these particular tests on modern BSDs.
Reviewers: emaste, brad, jhenderson
Reviewed By: jhenderson
Subscribers: joerg, krytarowski, llvm-commits
Differential Revision: https://reviews.llvm.org/D32120
llvm-svn: 301220
When functions are terminated by unreachable instructions, the last
instruction might trigger a CFI instruction to be generated. However,
emitting it would be be illegal since the function (and thus the FDE
the CFI is in) has already ended with the previous instruction.
Darwin's dwarfdump --verify --eh-frame complains about this and the
specification supports this.
Relevant bits from the DWARF 5 standard (6.4 Call Frame Information):
"[The] address_range [field in an FDE]: The number of bytes of
program instructions described by this entry."
"Row creation instructions: [...]
The new location value is always greater than the current one."
The first quotation implies that a CFI cannot describe a target
address outside of the enclosing FDE's range.
rdar://problem/26244988
Differential Revision: https://reviews.llvm.org/D32246
llvm-svn: 301219
Currently the operand type for ATOMIC_FENCE assumes value type of a pointer in address space 0.
This is fine for most targets. However for amdgcn target, the size of pointer in address space 0
depends on triple environment. For amdgiz environment, it is 64 bit but for other environment it is
32 bit. On the other hand, amdgcn target expects 32 bit fence operands independent of the target
triple environment. Therefore a hook is need in target lowering for getting the fence operand type.
This patch has no effect on targets other than amdgcn.
Differential Revision: https://reviews.llvm.org/D32186
llvm-svn: 301215
This is a straight cut and paste, but there's a bigger problem: if this
fold exists for simplifyOr, there should be a DeMorganized version for
simplifyAnd. But more than that, we have a patchwork of ad hoc logic
optimizations in InstCombine. There should be some structure to ensure
that we're not missing sibling folds across and/or/xor.
llvm-svn: 301213
Re-Commit of r300922 and r300923 with less aggressive assert (see
discussion at the end of https://reviews.llvm.org/D32205)
X86RegisterInfo::eliminateFrameIndex() and
X86FrameLowering::getFrameIndexReference() both had logic to compute the
base register. This consolidates the code.
Also use MachineInstr::isReturn instead of manually enumerating tail
call instructions (return instructions were not included in the previous
list because they never reference frame indexes).
Differential Revision: https://reviews.llvm.org/D32206
llvm-svn: 301211
When the location description of a source variable involves arithmetic
on the value itself, it needs to be marked with DW_OP_stack_value since it
is not describing the variable's location, but rather its value.
This is a follow-up to r297971 and fixes the source testcase quoted in
the comment in debuginfo-dce.ll.
rdar://problem/30725338
This reapplies r301093 without modifications.
llvm-svn: 301210
Fixes traps in any block besides the entry block,
and fixes depending on a live-in physical register
by using a virtual register copy.
Also happens to stop emitting a nop in the case
debug trap is not supported.
llvm-svn: 301206
The *real* difference between these two was that
a) The "graphical" dumper could recurse, while the text one could
not.
b) The "text" dumper could display nested types and functions,
while the graphical one could not.
Merge these two so that there is only one dumper that can recurse
arbitrarily deep and optionally display nested types or not.
llvm-svn: 301204
This reworks the way virtual bases are handled, and also the way
padding is detected across multiple levels of aggregates, producing
a much more accurate result.
llvm-svn: 301203
This replaces a hand written copy loop with a call to memcpy for both zext and sext.
For sext, it replaces multiple if/else blocks propagating sign information forward. Now we just do a copy, a sign extension on the last copied word, a memset, and clearUnusedBits.
Differential Revision: https://reviews.llvm.org/D32417
llvm-svn: 301201
There is logic to track the expected number of instructions
produced. It thought in this case an instruction would
be necessary to negate the result, but here it folded
into a ConstantExpr fneg when the non-undef value operand
was cancelled out by the second fsub.
I'm not sure why we don't fold constant FP ops with undef currently,
but I think that would also avoid this problem.
llvm-svn: 301199
This patch adds an in place version of ashr to match lshr and shl which were recently added.
I've tried to make this similar to the lshr code with additions to handle the sign extension. I've also tried to do this with less if checks than the current ashr code by sign extending the original result to a word boundary before doing any of the shifting. This removes a lot of the complexity of determining where to fill in sign bits after the shifting.
Differential Revision: https://reviews.llvm.org/D32415
llvm-svn: 301198
Summary:
Fix a compiler bug when the lane select happens to end up in a VGPR.
Clarify the semantic of the corresponding intrinsic to be that of
the corresponding GLSL: the lane select must be uniform across a
wave front, otherwise results are undefined.
Reviewers: arsenm
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D32343
llvm-svn: 301197
Summary:
Instead of keeping a variable indicating whether there are early exits
in the loop. We keep all the early exits. This improves LICM's ability to
move instructions out of the loop based on is-guaranteed-to-execute.
I am going to update compilation time as well soon.
Reviewers: hfinkel, sanjoy, efriedma, mkuper
Reviewed By: hfinkel
Subscribers: llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D32433
llvm-svn: 301196
Summary:
The return value of these intrinsics should always have 0 bits for
inactive threads. This means that when all arguments are constant
and the comparison evaluates to true, the intrinsic should return
the current exec mask.
Fixes some GL_ARB_shader_ballot tests.
Reviewers: arsenm
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D32344
llvm-svn: 301195
While we use BaseIndexOffset in FindBetterNeighborChains to
appropriately realize they're almost the same address and should be
improved concurrently we do not use it in isAlias using the non-index
understanding FindBaseOffset instead. Adding a BaseIndexOffset check
in isAlias like should allow indexed stores to be merged.
FindBaseOffset to be excised in subsequent patch.
Reviewers: jyknight, aditya_nandakumar, bogner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31987
llvm-svn: 301187
Summary: Region objects capture the address of the creating RegionInfo instance. Because the RegionInfo class is movable, moving a RegionInfo object creates dangling references. This patch fixes these references by walking the Regions post-move, and updating references to the new parent.
Reviewers: Meinersbur, grosser
Reviewed By: Meinersbur, grosser
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31719
llvm-svn: 301175
Summary: SUSE's ARM triples end with -gnueabi even though they are hard-float. This requires special handling of SUSE ARM triples. Hence we need a way to differentiate the SUSE as vendor. This CL adds that.
Reviewers: chandlerc, compnerd, echristo, rengolin
Reviewed By: rengolin
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D32426
llvm-svn: 301174
I found this when investigated "Bug 32319 - .gdb_index is broken/incomplete" for LLD.
When we have object file with .debug_ranges section it may be filled with zeroes.
Relocations are exist in file to relocate this zeroes into real values later, but until that
a pair of zeroes is treated as terminator. And DWARF parser thinks there is no ranges at all
when I am trying to collect address ranges for building .gdb_index.
Solution implemented in this patch is to take relocations in account when parsing ranges.
Differential revision: https://reviews.llvm.org/D32228
llvm-svn: 301170
We have to widen the operands to 32 bits and then we can either use
hardware division if it is available or lower to a libcall otherwise.
At the moment it is not enough to set the Legalizer action to
WidenScalar, since for libcalls it won't know what to do (it won't be
able to find what size to widen to, because it will find Libcall and not
Legal for 32 bits). To hack around this limitation, we request Custom
lowering, and as part of that we widen first and then we run another
legalizeInstrStep on the widened DIV.
llvm-svn: 301166
Instruction isb takes as an operand either 'sy' or an immediate value. This
improves the diagnostic when the string is not 'sy' and adds a test case for
this which was missing. This also adds tests to check invalid inputs for dsb
and dmb.
Differential Revision: https://reviews.llvm.org/D32227
llvm-svn: 301165
Add support for both targets with hardware division and without. For
hardware division we have to add support throughout the pipeline
(legalizer, reg bank select, instruction select). For targets without
hardware division, we only need to mark it as a libcall.
llvm-svn: 301164
Treat them the same as the other binary operations that we have so far,
but on integers rather than floating point types. Extract the common
code into a helper.
This will be used in the ARM backend.
llvm-svn: 301163
When selecting a G_CONSTANT to a MOVi, we need the value to be an Imm
operand. We used to just leave the G_CONSTANT operand unchanged, which
works in some cases (such as the GEP offsets that we create when
referring to stack slots). However, in many other places the G_CONSTANTs
are created with CImm operands. This patch makes sure to handle those as
well, and to error out gracefully if in the end we don't end up with an
Imm operand.
Thanks to Oliver Stannard for reporting this issue.
llvm-svn: 301162
Summary:
This is a tool for comparing the function graphs produced by the
llvm-xray graph too. It takes the form of a new subcommand of the
llvm-xray tool 'graph-diff'.
This initial version of the patch is very rough, but it is close to
feature complete.
Depends on D29363
Reviewers: dblaikie, dberris
Reviewed By: dberris
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D29320
llvm-svn: 301160
Previously single word would always return 0 regardless of the original sign. Multi word would return all 0s or all 1s based on the original sign. Now single word takes into account the sign as well.
llvm-svn: 301159
The changes are causing the i686-mingw32 build to fail.
This reverts commit r301153, and the changes for a separate warning on i686-mingw32 in r301155 and r301156.
llvm-svn: 301157
libraries are properly unloaded when llvm_shutdown is called.
Summary:
This was mostly affecting usage of the JIT, where storing the library handles in
a set made iteration unordered/undefined. This lead to disagreement between the
JIT and native code as to what the address and implementation of particularly on
Windows with stdlib functions:
JIT: putenv_s("TEST", "VALUE") // called msvcrt.dll, putenv_s
JIT: getenv("TEST") -> "VALUE" // called msvcrt.dll, getenv
Native: getenv("TEST") -> NULL // called ucrt.dll, getenv
Also fixed is the issue of DynamicLibrary::getPermanentLibrary(0,0) on Windows
not giving priority to the process' symbols as it did on Unix.
Reviewers: chapuni, v.g.vassilev, lhames
Reviewed By: lhames
Subscribers: danalbert, srhines, mgorny, vsk, llvm-commits
Differential Revision: https://reviews.llvm.org/D30107
llvm-svn: 301153
This change reboots SCEV's current (off by default) verification logic
to avoid false failures. Instead of stringifying trip counts, it maps
old and new trip counts to the same ScalarEvolution "universe" and
asks ScalarEvolution to compute the difference between them. If the
difference comes out to be a non-zero constant, then (barring some
corner cases) we *know* we messed up.
I've not yet enabled this by default since it hits an exponential time
issue in SCEV, but once I fix that, I'll flip it on by default in
EXPENSIVE_CHECKS builds.
llvm-svn: 301146
We handled all of the commuted variants for plain xor already,
although they were scattered around and sometimes folded less
efficiently using distributive laws. We had no folds for not-xor.
Handling all of these patterns consistently is part of trying to
reinstate:
https://reviews.llvm.org/rL300977
llvm-svn: 301144