The IntrNoMem, IntrReadMem, IntrWriteMem, and IntrArgMemOnly intrinsic
properties differ from their corresponding LLVM IR attributes by specifying
that the intrinsic, in addition to its memory properties, has no other side
effects.
The IntrHasSideEffects flag used in combination with one of the memory flags
listed above, makes it possible to define an intrinsic such that its
properties at the CodeGen layer match its properties at the IR layer.
Patch by Tom Stellard
llvm-svn: 301685
The method is called "get *Param* Alignment", and is only used for
return values exactly once, so it should take argument indices, not
attribute indices.
Avoids confusing code like:
IsSwiftError = CS->paramHasAttr(ArgIdx, Attribute::SwiftError);
Alignment = CS->getParamAlignment(ArgIdx + 1);
Add getRetAlignment to handle the one case in Value.cpp that wants the
return value alignment.
This is a potentially breaking change for out-of-tree backends that do
their own call lowering.
llvm-svn: 301682
Adds a new method finalizeLowering to TargetLoweringBase. This is in
preparation for an upcoming commit.
This function is meant for target specific adjustments to
MachineFrameInfo or register reservations.
Move the freezeRegisters() and the hasCopyImplyingStackAdjustment()
handling into the new function to prove the concept. As an added bonus
GlobalISel no longer missed the hasCopyImplyingStackAdjustment()
handling with this.
Differential Revision: https://reviews.llvm.org/D32621
llvm-svn: 301679
Summary:
When using ThinLTO, the linker performs its own parallelism. This
change limits the number of parallel link jobs that Ninja will issue
to keep the total number of threads reasonable when linking with
ThinLTO.
Reviewers: hans, ruiu
Subscribers: mgorny, mehdi_amini, Prazek
Differential Revision: https://reviews.llvm.org/D31990
llvm-svn: 301676
This eliminates many extra 'Idx' induction variables in loops over
arguments in CodeGen/ and Target/. It also reduces the number of places
where we assume that ReturnIndex is 0 and that we should add one to
argument numbers to get the corresponding attribute list index.
NFC
llvm-svn: 301666
This became no longer necessary after D19462 landed, and will be incompatible
with an upcoming change to the summary data structures that changes how we
represent references.
llvm-svn: 301660
. swap 4-bit register encoding, 16-bit offset and 32-bit imm to support big endian archs
. add a test
Reported-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
llvm-svn: 301653
Summary:
The motivation example is like below which has 13 cases but only 2 distinct targets
```
lor.lhs.false2: ; preds = %if.then
switch i32 %Status, label %if.then27 [
i32 -7012, label %if.end35
i32 -10008, label %if.end35
i32 -10016, label %if.end35
i32 15000, label %if.end35
i32 14013, label %if.end35
i32 10114, label %if.end35
i32 10107, label %if.end35
i32 10105, label %if.end35
i32 10013, label %if.end35
i32 10011, label %if.end35
i32 7008, label %if.end35
i32 7007, label %if.end35
i32 5002, label %if.end35
]
```
which is compiled into a balanced binary tree like this on AArch64 (similar on X86)
```
.LBB853_9: // %lor.lhs.false2
mov w8, #10012
cmp w19, w8
b.gt .LBB853_14
// BB#10: // %lor.lhs.false2
mov w8, #5001
cmp w19, w8
b.gt .LBB853_18
// BB#11: // %lor.lhs.false2
mov w8, #-10016
cmp w19, w8
b.eq .LBB853_23
// BB#12: // %lor.lhs.false2
mov w8, #-10008
cmp w19, w8
b.eq .LBB853_23
// BB#13: // %lor.lhs.false2
mov w8, #-7012
cmp w19, w8
b.eq .LBB853_23
b .LBB853_3
.LBB853_14: // %lor.lhs.false2
mov w8, #14012
cmp w19, w8
b.gt .LBB853_21
// BB#15: // %lor.lhs.false2
mov w8, #-10105
add w8, w19, w8
cmp w8, #9 // =9
b.hi .LBB853_17
// BB#16: // %lor.lhs.false2
orr w9, wzr, #0x1
lsl w8, w9, w8
mov w9, #517
and w8, w8, w9
cbnz w8, .LBB853_23
.LBB853_17: // %lor.lhs.false2
mov w8, #10013
cmp w19, w8
b.eq .LBB853_23
b .LBB853_3
.LBB853_18: // %lor.lhs.false2
mov w8, #-7007
add w8, w19, w8
cmp w8, #2 // =2
b.lo .LBB853_23
// BB#19: // %lor.lhs.false2
mov w8, #5002
cmp w19, w8
b.eq .LBB853_23
// BB#20: // %lor.lhs.false2
mov w8, #10011
cmp w19, w8
b.eq .LBB853_23
b .LBB853_3
.LBB853_21: // %lor.lhs.false2
mov w8, #14013
cmp w19, w8
b.eq .LBB853_23
// BB#22: // %lor.lhs.false2
mov w8, #15000
cmp w19, w8
b.ne .LBB853_3
```
However, the inline cost model estimates the cost to be linear with the number
of distinct targets and the cost of the above switch is just 2 InstrCosts.
The function containing this switch is then inlined about 900 times.
This change use the general way of switch lowering for the inline heuristic. It
etimate the number of case clusters with the suitability check for a jump table
or bit test. Considering the binary search tree built for the clusters, this
change modifies the model to be linear with the size of the balanced binary
tree. The model is off by default for now :
-inline-generic-switch-cost=false
This change was originally proposed by Haicheng in D29870.
Reviewers: hans, bmakam, chandlerc, eraman, haicheng, mcrosier
Reviewed By: hans
Subscribers: joerg, aemerson, llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D31085
llvm-svn: 301649
Summary:
Skip memops if the total value profiled count is 0, we can't correctly
scale up the counts and there is no point anyway.
Reviewers: davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32624
llvm-svn: 301645
Reapplied r299221 after fix for nondeterminism in ThinLTO builder (rL301599), with extra check for implicit truncation of inserted element.
llvm-svn: 301644
Declare the ARMInstructionSelector in an anonymous namespace, to make it
more in line with the other targets which were migrated to this in
r299637 in order to avoid TableGen'erated headers being included in
non-GlobalISel builds.
llvm-svn: 301632
There was a garbage character in output introduced by myself in
r290040 "[DWARF] - Introduce DWARFDebugPubTable class for dumping pub* sections."
llvm-svn: 301631
This is a follow up to the fix in r298360 to improve the handling of debug
values when redundant LEAs are removed. The fix in r298360 effectively
discarded the debug values. This patch now attempts to preserve the debug
values by using the DWARF DW_OP_stack_value operation via prependDIExpr.
Moved functions appendOffset and prependDIExpr from Local.cpp to
DebugInfoMetadata.cpp and made them available as static member functions of
DIExpression.
Differential Revision: https://reviews.llvm.org/D31604
llvm-svn: 301630
EarlyCSE should not just ignore assumes. It should use the fact that its condition is true for all dominated instructions.
Reviewers: sanjoy, reames, apilipenko, anna, skatkov
Reviewed By: reames, sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32482
llvm-svn: 301625
If a condition is calculated only once, and there are multiple guards on this condition, we should be able
to remove all guards dominated by the first of them. This patch allows EarlyCSE to try to find the condition
of a guard among the known values, and if it is true, remove the guard. Otherwise we keep the guard and
mark its condition as 'true' for future consideration.
Reviewers: sanjoy, reames, apilipenko, skatkov, anna, dberlin
Reviewed By: reames, sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32476
llvm-svn: 301623
This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently.
This is largely a mechanical transformation from KnownZero to Known.Zero.
Differential Revision: https://reviews.llvm.org/D32569
llvm-svn: 301620
This patch uses various APInt methods to reduce the number of temporary APInts. These were all found while working through converting SelectionDAG's computeKnownBits to also use the KnownBits struct recently added to the ValueTracking version.
llvm-svn: 301618
Summary:
In some cases LLVM (especially the SLP vectorizer) will create vectors
that are 256 bytes (or larger). Given that this is intentional[0] is
likely to get more common, this patch updates the StackMap binary
format to deal with the spill locations for said vectors.
This change also bumps the stack map version from 2 to 3.
[0]: https://reviews.llvm.org/D32533#738350
Reviewers: reames, kavon, skatkov, javed.absar
Subscribers: mcrosier, nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D32629
llvm-svn: 301615
COFF Import libraries which use the obsolete CONSTANT export are
supposed to get two symbols, one with the `_imp_` prefix and one
without. Ensure that we expose both for iteration. This is necessary
to fix the librarian with COFF CONSTANT exports.
llvm-svn: 301614
When dumping raw data from a stream, you might know the offset
of a certain record you're interested in, as well as how long
that record is. Previously, you had to dump the entire stream
and wade through the bytes to find the interesting record.
This patch allows you to specify an offset and length on the
command line, and it will only dump the requested range.
llvm-svn: 301607
Reviewers: zturner, hansw, hans
Reviewed By: hans
Subscribers: hans, llvm-commits
Differential Revision: https://reviews.llvm.org/D32611
llvm-svn: 301595
Use a combination of !associated, comdat, @llvm.compiler.used and
custom sections to allow dead stripping of globals and their asan
metadata. Sometimes.
Currently this works on LLD, which supports SHF_LINK_ORDER with
sh_link pointing to the associated section.
This also works on BFD, which seems to treat comdats as
all-or-nothing with respect to linker GC. There is a weird quirk
where the "first" global in each link is never GC-ed because of the
section symbols.
At this moment it does not work on Gold (as in the globals are never
stripped).
This is a second re-land of r298158. This time, this feature is
limited to -fdata-sections builds.
llvm-svn: 301587
When possible, put ASan ctor/dtor in comdat.
The only reason not to is global registration, which can be
TU-specific. This is not the case when there are no instrumented
globals. This is also limited to ELF targets, because MachO does
not have comdat, and COFF linkers may GC comdat constructors.
The benefit of this is a lot less __asan_init() calls: one per DSO
instead of one per TU. It's also necessary for the upcoming
gc-sections-for-globals change on Linux, where multiple references to
section start symbols trigger quadratic behaviour in gold linker.
This is a second re-land of r298756. This time with a flag to disable
the whole thing to avoid a bug in the gold linker:
https://sourceware.org/bugzilla/show_bug.cgi?id=19002
llvm-svn: 301586
This patch dumps the raw bytes of the .rsrc sections that
are present in COFF object and executable files. Subsequent
patches will parse this information and dump in a more human
readable format.
Differential Revision: https://reviews.llvm.org/D32463
Patch By: Eric Beckmann
llvm-svn: 301578
Currently, this pass only focuses on *trivial* loop unswitching. At that
reduced problem it remains significantly better than the current loop
unswitch:
- Old pass is worse than cubic complexity. New pass is (I think) linear.
- New pass is much simpler in its design by focusing on full unswitching. (See
below for details on this).
- New pass doesn't carry state for thresholds between pass iterations.
- New pass doesn't carry state for correctness (both miscompile and
infloop) between pass iterations.
- New pass produces substantially better code after unswitching.
- New pass can handle more trivial unswitch cases.
- New pass doesn't recompute the dominator tree for the entire function
and instead incrementally updates it.
I've ported all of the trivial unswitching test cases from the old pass
to the new one to make sure that major functionality isn't lost in the
process. For several of the test cases I've worked to improve the
precision and rigor of the CHECKs, but for many I've just updated them
to handle the new IR produced.
My initial motivation was the fact that the old pass carried state in
very unreliable ways between pass iterations, and these mechansims were
incompatible with the new pass manager. However, I discovered many more
improvements to make along the way.
This pass makes two very significant assumptions that enable most of these
improvements:
1) Focus on *full* unswitching -- that is, completely removing whatever
control flow construct is being unswitched from the loop. In the case
of trivial unswitching, this means removing the trivial (exiting)
edge. In non-trivial unswitching, this means removing the branch or
switch itself. This is in opposition to *partial* unswitching where
some part of the unswitched control flow remains in the loop. Partial
unswitching only really applies to switches and to folded branches.
These are very similar to full unrolling and partial unrolling. The
full form is an effective canonicalization, the partial form needs
a complex cost model, cannot be iterated, isn't canonicalizing, and
should be a separate pass that runs very late (much like unrolling).
2) Leverage LLVM's Loop machinery to the fullest. The original unswitch
dates from a time when a great deal of LLVM's loop infrastructure was
missing, ineffective, and/or unreliable. As a consequence, a lot of
complexity was added which we no longer need.
With these two overarching principles, I think we can build a fast and
effective unswitcher that fits in well in the new PM and in the
canonicalization pipeline. Some of the remaining functionality around
partial unswitching may not be relevant today (not many test cases or
benchmarks I can find) but if they are I'd like to add support for them
as a separate layer that runs very late in the pipeline.
Purely to make reviewing and introducing this code more manageable, I've
split this into first a trivial-unswitch-only pass and in the next patch
I'll add support for full non-trivial unswitching against a *fixed*
threshold, exactly like full unrolling. I even plan to re-use the
unrolling thresholds, as these are incredibly similar cost tradeoffs:
we're cloning a loop body in order to end up with simplified control
flow. We should only do that when the total growth is reasonably small.
One of the biggest changes with this pass compared to the previous one
is that previously, each individual trivial exiting edge from a switch
was unswitched separately as a branch. Now, we unswitch the entire
switch at once, with cases going to the various destinations. This lets
us unswitch multiple exiting edges in a single operation and also avoids
numerous extremely bad behaviors, where we would introduce 1000s of
branches to test for thousands of possible values, all of which would
take the exact same exit path bypassing the loop. Now we will use
a switch with 1000s of cases that can be efficiently lowered into
a jumptable. This avoids relying on somehow forming a switch out of the
branches or getting horrible code if that fails for any reason.
Another significant change is that this pass actively updates the CFG
based on unswitching. For trivial unswitching, this is actually very
easy because of the definition of loop simplified form. Doing this makes
the code coming out of loop unswitch dramatically more friendly. We
still should run loop-simplifycfg (at the least) after this to clean up,
but it will have to do a lot less work.
Finally, this pass makes much fewer attempts to simplify instructions
based on the unswitch. Something like loop-instsimplify, instcombine, or
GVN can be used to do increasingly powerful simplifications based on the
now dominating predicate. The old simplifications are things that
something like loop-instsimplify should get today or a very, very basic
loop-instcombine could get. Keeping that logic separate is a big
simplifying technique.
Most of the code in this pass that isn't in the old one has to do with
achieving specific goals:
- Updating the dominator tree as we go
- Unswitching all cases in a switch in a single step.
I think it is still shorter than just the trivial unswitching code in
the old pass despite having this functionality.
Differential Revision: https://reviews.llvm.org/D32409
llvm-svn: 301576
Just calling dropAllReferences leaves pointers to the ConstantExpr
behind, so we would eventually crash with a null pointer dereference.
Differential Revision: https://reviews.llvm.org/D32551
llvm-svn: 301575
Summary:
Misc improvements to debug output. Fix a couple typos and also dump the
value profile before we make any profitability checks.
Reviewers: davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32607
llvm-svn: 301574
Generate the better include paths. Instead of #include <llvm_header.h> doxygen
produces #include "llvm/Folder/llvm_header.h"
Patch by Yuka Takahashi (D32342)!
llvm-svn: 301569
Summary:
The type of the target frame index is intptr, not the type of the value we're
going to store into it. Without this change we crash in the attached test case
when trying to type-legalize a TargetFrameIndex.
Patchpoint lowering types the target frame index as intptr as well.
Reviewers: reames, bogner, arsenm
Subscribers: arsenm, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D32256
llvm-svn: 301566
libraries are properly unloaded when llvm_shutdown is called.
Summary:
This was mostly affecting usage of the JIT, where storing the library handles in
a set made iteration unordered/undefined. This lead to disagreement between the
JIT and native code as to what the address and implementation of particularly on
Windows with stdlib functions:
JIT: putenv_s("TEST", "VALUE") // called msvcrt.dll, putenv_s
JIT: getenv("TEST") -> "VALUE" // called msvcrt.dll, getenv
Native: getenv("TEST") -> NULL // called ucrt.dll, getenv
Also fixed is the issue of DynamicLibrary::getPermanentLibrary(0,0) on Windows
not giving priority to the process' symbols as it did on Unix.
Reviewers: chapuni, v.g.vassilev, lhames
Reviewed By: lhames
Subscribers: danalbert, srhines, mgorny, vsk, llvm-commits
Differential Revision: https://reviews.llvm.org/D30107
llvm-svn: 301562
Previously parsing of these were all grouped together into a
single master class that could parse any type of debug info
fragment.
With writing forthcoming, the complexity of each individual
fragment is enough to warrant them having their own classes so
that reading and writing of each fragment type can be grouped
together, but isolated from the code for reading and writing
other fragment types.
In doing so, I found a place where parsing code was duplicated
for the FileChecksums fragment, across llvm-readobj and the
CodeView library, and one of the implementations had a bug.
Now that the codepaths are merged, the bug is resolved.
Differential Revision: https://reviews.llvm.org/D32547
llvm-svn: 301557
We have a lot of very similarly named classes related to
dealing with module debug info. This patch has NFC, it just
renames some classes to be more descriptive (albeit slightly
more to type). The mapping from old to new class names is as
follows:
Old | New
ModInfo | DbiModuleDescriptor
ModuleSubstream | ModuleDebugFragment
ModStream | ModuleDebugStream
With the corresponding Builder classes renamed accordingly.
Differential Revision: https://reviews.llvm.org/D32506
llvm-svn: 301555
Author: milena.vujosevic.janicic
Reviewers: sdardis
The code implements size reduction pass for MicroMIPS.
Load and store instructions are examined and transformed, if possible.
lw32 instruction is transformed into 16-bit instruction lwsp
sw32 instruction is transformed into 16-bit instruction swsp
Arithmetic instrcutions are examined and transformed, if possible.
addu32 instruction is transformed into 16-bit instruction addu16
subu32 instruction is transformed into 16-bit instruction subu16
Differential Revision: https://reviews.llvm.org/D15144
llvm-svn: 301540
In getCmpSelInstrCost(), CondTy may actually be scalar while ValTy is a
vector when LoopVectorizer is the caller. Therefore the assert that CondTy
must be a vector type if ValTy is was wrong and is now removed.
Review: Ulrich Weigand
llvm-svn: 301533
Fix a crash when trying to extend a value passed as a sign- or
zero-extended stack parameter. The cause of the crash was that we were
setting the size of the loaded value to 32 bits, and then tyring to
extend again to 32 bits.
This patch addresses the issue by also introducing a G_TRUNC after the
load. This will leave the unused bits to their original values set by
the caller, while being consistent about the types. For values that are
not extended, we just use a smaller load.
llvm-svn: 301531
It is useful to output size of ranges when address ranges
section of .gdb_index is dumped.
It helps to compare outputs produced by different linkers,
for example. In that case address ranges can look very different,
when they are the same at fact. Difference comes from different
low address because of different address of .text.
Differential revision: https://reviews.llvm.org/D32492
llvm-svn: 301527
'Src' looks like it was borrowed from memcpy, 'Val' makes more sense for
memset and is consistent with naming within the function.
Differential Revision: https://reviews.llvm.org/D32580
llvm-svn: 301521
This changes code that touches ValueHandleBase::V to go through
getValPtr and (newly added) setValPtr. This functionality will be
used later, but also seemed like a generally good cleanup.
I also renamed the field to Val, but that's just to make it obvious
that I fixed all the uses.
llvm-svn: 301518
Previously, an expression such as Saver.save(std::string("foo") + "bar")
didn't compile because there is an ambiguity as to whether the argument
is of const Twine& or StringRef.
llvm-svn: 301512
also a discussion about exactly what we should do prior to re-enabling
it.
The current bug is http://llvm.org/PR32821 and the discussion about this
is in the review thread for r300200.
llvm-svn: 301505
DISubprogram currently has 10 pointer operands, several of which are
often nullptr. This patch reduces the amount of memory allocated by
DISubprogram by rearranging the operands such that containing type,
template params, and thrown types come last, and are only allocated
when they are non-null (or followed by non-null operands).
This patch also eliminates the entirely unused DisplayName operand.
This saves up to 4 pointer operands per DISubprogram. (I tried
measuring the effect on peak memory usage on an LTO link of an X86
llc, but the results were very noisy).
This reapplies r301498 with an attempted workaround for g++.
Differential Revision: https://reviews.llvm.org/D32560
llvm-svn: 301501
DISubprogram currently has 10 pointer operands, several of which are
often nullptr. This patch reduces the amount of memory allocated by
DISubprogram by rearranging the operands such that containing type,
template params, and thrown types come last, and are only allocated
when they are non-null (or followed by non-null operands).
This patch also eliminates the entirely unused DisplayName operand.
This saves up to 4 pointer operands per DISubprogram. (I tried
measuring the effect on peak memory usage on an LTO link of an X86
llc, but the results were very noisy).
llvm-svn: 301498
It was doing the same as the base implementation and was irritating me
when I was searching for backends that have custom behavior for
canRealignStack.
llvm-svn: 301495
For Swift we would like to be able to encode the error types that a
function may throw, so the debugger can display them alongside the
function's return value when finish-ing a function.
DWARF defines DW_TAG_thrown_type (intended to be used for C++ throw()
declarations) that is a perfect fit for this purpose. This patch wires
up support for DW_TAG_thrown_type in LLVM by adding a list of thrown
types to DISubprogram.
To offset the cost of the extra pointer, there is a follow-up patch
that turns DISubprogram into a variable-length node.
rdar://problem/29481673
Differential Revision: https://reviews.llvm.org/D32559
llvm-svn: 301489
The previous algorithm processed one character at a time, which is very
painful on a modern CPU. Replace it with xxHash64, which both already
exists in the codebase and is fairly fast.
Patch from Scott Smith!
Differential Revision: https://reviews.llvm.org/D32509
llvm-svn: 301487
Besides better codegen, the motivation is to be able to canonicalize this pattern
in IR (currently we don't) knowing that the backend is prepared for that.
This may also allow removing code for special constant cases in
DAGCombiner::foldSelectOfConstants() that was added in D30180.
Differential Revision: https://reviews.llvm.org/D31944
llvm-svn: 301457
Summary of changes:
- corrected vmcnt, expcnt, lgkmcnt helpers to checks their argument for truncation;
- added saturated versions of these helpers.
See bug 32711 for details: https://bugs.llvm.org//show_bug.cgi?id=32711
Reviewers: artem.tamazov, vpykhtin
Differential Revision: https://reviews.llvm.org/D32546
llvm-svn: 301439
Marking them as used causes them to be considered visible outside of LTO. This
prevents the symbols from being internalized or discarded, either by GlobalDCE
or by summary-based dead stripping in ThinLTO.
This change makes it unnecessary to add these symbols to llvm.compiler.used
in the backend, as the symbols are kept alive by virtue of being external,
so remove the backend code that handles that.
Fixes PR32798.
Differential Revision: https://reviews.llvm.org/D32544
llvm-svn: 301438
This patch introduces a new KnownBits struct that wraps the two APInt used by computeKnownBits. This allows us to treat them as more of a unit.
Initially I've just altered the signatures of computeKnownBits and InstCombine's simplifyDemandedBits to pass a KnownBits reference instead of two separate APInt references. I'll do similar to the SelectionDAG version of computeKnownBits/simplifyDemandedBits as a separate patch.
I've added a constructor that allows initializing both APInts to the same bit width with a starting value of 0. This reduces the repeated pattern of initializing both APInts. Once place default constructed the APInts so I added a default constructor for those cases.
Going forward I would like to add more methods that will work on the pairs. For example trunc, zext, and sext occur on both APInts together in several places. We should probably add a clear method that can be used to clear both pieces. Maybe a method to check for conflicting information. A method to return (Zero|One) so we don't write it out everywhere. Maybe a method for (Zero|One).isAllOnesValue() to determine if all bits are known. I'm sure there are many other methods we can come up with.
Differential Revision: https://reviews.llvm.org/D32376
llvm-svn: 301432
Commits were:
"Use WeakVH instead of WeakTrackingVH in AliasSetTracker's UnkownInsts"
"Add a new WeakVH value handle; NFC"
"Rename WeakVH to WeakTrackingVH; NFC"
The changes assumed pointers are 8 byte aligned on all architectures.
llvm-svn: 301429
Summary:
In cases where an instruction (a call site, say) is RAUW'ed with some
other value (this is possible via the `returned` attribute, amongst
other things), we want the slot in UnknownInsts to point to the
original Instruction we wanted to track, not the value it got replaced
by.
Fixes PR32587.
Reviewers: davide
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D32268
llvm-svn: 301426
Summary:
WeakVH nulls itself out if the value it was tracking gets deleted, but
it does not track RAUW.
Reviewers: dblaikie, davide
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D32267
llvm-svn: 301425
Summary:
I plan to use WeakVH to mean "nulls itself out on deletion, but does
not track RAUW" in a subsequent commit.
Reviewers: dblaikie, davide
Reviewed By: davide
Subscribers: arsenm, mehdi_amini, mcrosier, mzolotukhin, jfb, llvm-commits, nhaehnle
Differential Revision: https://reviews.llvm.org/D32266
llvm-svn: 301424
The SampleProfWriter emits function information in an order determined
by the string hash function. The situation is a bit brittle, because
changing the hash function can break the tests.
Instead of sorting the function samples to get a relaible ordering (that
might be too expensive), make the tests not depend on a particular
ordering of function samples.
Differential Revision: https://reviews.llvm.org/D32516
llvm-svn: 301419
Build vectors have magical truncation powers, so we have things like this:
v4i1 = BUILD_VECTOR Constant:i32<1>, Constant:i32<1>, Constant:i32<1>, Constant:i32<1>
v4i16 = BUILD_VECTOR Constant:i32<1>, Constant:i32<1>, Constant:i32<1>, Constant:i32<1>
If we don't truncate the splat node returned by getConstantSplatNode(), then we won't find
truth when ZeroOrNegativeOneBooleanContent is the rule.
Differential Revision: https://reviews.llvm.org/D32505
llvm-svn: 301408
For targets that don't have ISD::MULHS or ISD::SMUL_LOHI for the type
and the double width type is illegal, then the two operands are
sign extended to twice their size then multiplied to check for overflow.
The extended upper halves were mismatched causing an incorrect result.
This fixes the mismatch.
A test was added for ARM V6-M where the bug was detected.
Patch by James Duley.
Differential Revision: https://reviews.llvm.org/D31807
llvm-svn: 301404
Summary:
Otherwise we might end up with some empty basic blocks or
single-entry-single-exit basic blocks.
This fixes PR32085
Reviewers: chandlerc, danielcdh
Subscribers: mehdi_amini, RKSimon, llvm-commits
Differential Revision: https://reviews.llvm.org/D30468
llvm-svn: 301395
Removed micro mips register classes for gp initialization because gp initialization uses pure mips64 instruction. Even when compiling for micro mips, gp initialization can be done with pure mips64 instructions.
Reviewed by Simon Dardis
Differential: D32286
llvm-svn: 301394
r299766 contained a "conditional move or jump depends on uninitialized value"
fault, identified by valgrind. This occurred as MipsFastISel::finishCall(..)
used CCState over MipsCCState. The latter is required for the TableGen'd calling
convention logic due to reliance on pre-analyzing type information to lower call
results/returns of vectors correctly.
This change modifies the MipsCC AnalyzeCallResult to be useful with both the
SelectionDAG and FastISel lowering logic.
Reviewers: slthakur
Differential Revision: https://reviews.llvm.org/D32004
llvm-svn: 301392
Summary:
Expose the internal query structure, start using it.
Note: This is the most minimal change possible i could create. I have
trivial followups, like fixing the one use of const FastMathFlags &,
the renaming of CtxI to be consistent, etc.
This should be NFC.
Reviewers: majnemer, davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32448
llvm-svn: 301379
If Select pseudo instruction doesn't have use SR, then
CMP instructions are being marked as dead and later can be
removed by MachineCSE pass. This leads to incorrect code
generation.
Differential Revision: https://reviews.llvm.org/D32473
llvm-svn: 301372
The order in which GCOV file info is printed depends on the string hash
function. This makes some GCOV tests brittle, because the tests must be
updated whenever the hash function changes.
Sort the filenames before printing out the file info to solve the
problem. This should be relatively cheap.
Differential Revision: https://reviews.llvm.org/D32512
llvm-svn: 301371
Summary:
Addends are used as offsets to addresses of globals
and can be both positive and negative. This change
prints libObject in line with the spec and the MC
layer.
Subscribers: jfb, dschuff
Differential Revision: https://reviews.llvm.org/D32507
llvm-svn: 301369
We were already parsing and dumping this to the human readable
format, but not to the YAML format. This does so, in preparation
for reading it in and reconstructing the line information from
YAML.
llvm-svn: 301357
We already have a function toHex that will convert a string like
"\xFF\xFF" to the string "FFFF", but we do not have one that goes
the other way - i.e. to convert a textual string representing a
sequence of hexadecimal characters into the corresponding actual
bytes. This patch adds such a function.
llvm-svn: 301356
This may trigger a segfault in llvm-objdump when the line number stored
in debug infromation points beyond the end of file; lines in LineBuffer
are stored in std::vector which is allocated in chunks, so even if the
debug info points beyond the end of the file, this doesn't necessarily
trigger the segfault unless the line number points beyond the allocated
space.
Differential Revision: https://reviews.llvm.org/D32466
llvm-svn: 301347
This patch is part of D28975's breakdown.
induction.ll encodes the specific (and rather arbitrary) numbers given to
predicated basic blocks by the unique naming mechanism, which makes it
sensitive to changes in LV's instruction generation order. This patch replaces
those specific numbers with a numeric pattern.
Differential Revision: https://reviews.llvm.org/D32404
llvm-svn: 301345
The code I've removed here exists in ExpandBinOp in InstSimplify which we call into before SimplifyUsingDistributiveLaws. The code in InstSimplify looks to have been copied from here.
I verified this code doesn't fire on any lit tests. Not that that proves its definitely dead.
Differential Revision: https://reviews.llvm.org/D32472
llvm-svn: 301341
This patch uses various APInt methods to reduce temporary APInt creation.
This should be all of the unrelated cleanups that got buried in D32376(creating a KnownBits struct) as well as some pointed out by Simon during the review of that. Plus a few improvements to use counting instead of masking.
I've left out any places where we do something like (KnownZero & KnownOne) != 0 as I plan to add a helper method to KnownBits to ask that question and didn't want to thrash that code an additional time.
Differential Revision: https://reviews.llvm.org/D32495
llvm-svn: 301338
The code Sanjay Patel moved over from InstCombine doesn't work properly if the 'and' has both inputs as nots because we used a commuted op matcher on the 'and' first. But this will bind to the first 'not' on 'and' when there could be two 'not's. InstCombine could rely on DeMorgan to ensure the 'and' wouldn't have two 'not's eventually, but InstSimplify can't rely on that.
This patch matches the xor first then checks for the ands and allows a not of either operand of the xor.
Differential Revision: https://reviews.llvm.org/D32458
llvm-svn: 301329
This is a pre-commit for a patch I'm working on to turn KnownZero/One into a struct. Once I do that the type here will be less obvious.
llvm-svn: 301324
This is a pre-commit for a patch that I'm working on to merge KnownZero/KnownOne into a KnownBits struct which would have had to touch this line.
llvm-svn: 301323
This patch reapplies r301309 with the fix to the MIR test to fix the assertion
triggered by r301309. Had trimmed a little bit too much from the MIR!
llvm-svn: 301317
The matching here wasn't able to handle all the possible commutes. It always assumed the not would be on the left of the xor, but that's not guaranteed.
Differential Revision: https://reviews.llvm.org/D32474
llvm-svn: 301316
Summary: No test case since I'm not aware of an in-tree target that needs this.
Reviewers: hans
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32398
llvm-svn: 301311
This patch fixes a bug with the updating of DBG_VALUE's in
BreakAntiDependencies. Previously, it would only attempt to update the first
DBG_VALUE following the instruction whose register is being changed,
potentially leaving DBG_VALUE's referring to the wrong register. Now the code
will update all DBG_VALUE's that immediately follow the instruction.
This issue was detected as a result of an optimized codegen difference with
"-g" where an X86 byte/word fixup was not performed due to a DBG_VALUE
referencing the wrong register.
Differential Revision: https://reviews.llvm.org/D31755
llvm-svn: 301309
One of the fast-math optimizations is to replace calls to standard double
functions with their float equivalents, e.g. exp -> expf. However, this can
cause infinite loops for the following:
float expf(float val) { return (float) exp((double) val); }
A similar inline declaration exists in the MinGW-w64 math.h header file which
when compiled with -O2/3 and fast-math generates infinite loops.
So this fix checks that the calling function to the standard double function
that is being replaced does not match the float equivalent.
Differential Revision: https://reviews.llvm.org/D31806
llvm-svn: 301304
Summary:
In a previous change I changed SCEV's normalization / denormalization
to work with non-affine add recs. So the bailout in IVUsers can be
removed.
Reviewers: atrick, efriedma
Reviewed By: atrick
Subscribers: davide, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D32105
llvm-svn: 301298
This patch is part of D28975's breakdown.
Genreating the control-flow to guard predicated instructions modified to
only use SplitBlockAndInsertIfThen() for producing the if-then construct.
Differential Revision: https://reviews.llvm.org/D32224
llvm-svn: 301293
We need to do this to prevent a miscompile which sinks an objc_retain
past an objc_release that releases the object objc_retain retains. This
happens because the top-down and bottom-up traversals each determines
the insert point for retain or release individually without knowing
where the other instruction is moved.
For example, when the following IR is fed to the ARC optimizer, the
top-down traversal decides to insert objc_retain right before
objc_release and the bottom-up traversal decides to insert objc_release
right after clang.arc.use.
(IR before ARC optimizer)
%11 = call i8* @objc_retain(i8* %10)
call void (...) @clang.arc.use(%0* %5)
call void @llvm.dbg.value(...)
call void @objc_release(i8* %6)
This reverses the order of objc_release and objc_retain, which causes
the object to be destructed prematurely.
(IR after ARC optimizer)
call void (...) @clang.arc.use(%0* %5)
call void @objc_release(i8* %6)
call void @llvm.dbg.value(...)
%11 = call i8* @objc_retain(i8* %10)
rdar://problem/30530580
llvm-svn: 301289
Summary:
Before this change, SCEV Normalization would incorrectly normalize
non-affine add recurrences. To work around this there was (still is)
a check in place to make sure we only tried to normalize affine add
recurrences.
We recently found a bug in aforementioned check to bail out of
normalizing non-affine add recurrences. However, instead of fixing
the bailout, I have decided to teach SCEV normalization to work
correctly with non-affine add recurrences, making the bailout
unnecessary (I'll remove it in a subsequent change).
I've also added some unit tests (which would have failed before this
change).
Reviewers: atrick, sunfish, efriedma
Reviewed By: atrick
Subscribers: mcrosier, mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D32104
llvm-svn: 301281
I'm proposing a fold for increment-of-sexted-bool in:
https://reviews.llvm.org/D31944
...so we need to know what happens in more cases like these.
llvm-svn: 301269
Remove the temporary, poorly named getSlotSet method which did the same
thing. Also remove getSlotNode, which is a hold-over from when we were
dealing with AttributeSetNode* instead of AttributeSet.
llvm-svn: 301267
Summary:
`git apply` on Windows doesn't work for files that SVN checks out as
CRLF. There is no way to force SVN to check everything out with Unix
line endings on Windows. Files with svn:eol-style=native will always
come out with CRLF, breaking `git apply`, which wants Unix line endings.
My workaround is to list all files with this property set in the change,
and run `dos2unix` on them. SVN doesn't commit a massive line ending
change because the svn:eol-style property indicates that these are text
files.
Tested on r301245.
Reviewers: zturner, jlebar
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32452
llvm-svn: 301262
We can simplify (and (icmp X, C1), (icmp X, C2)) to one of the icmps in many cases.
I had to check some of these with Alive to prove to myself it's right, but everything
seems to check out. Eg, the code in instcombine was completely ignoring predicates with
mismatched signedness.
Handling or-of-icmps would be a follow-up step.
Differential Revision: https://reviews.llvm.org/D32143
llvm-svn: 301260
Rely on MachineRegisterInfo's knowledge of used physical
registers.
Move flat_scratch initialization earlier, so the uses are visible
when making these decisions.
This will make it easier to add another reserved register
at the end for the stack pointer rather than handling another
special case.
llvm-svn: 301254
Summary:
That API creates a temporary AttributeList to carry an index and a
single AttributeSet. We need to carry the index in addition to the set,
because that is how attribute groups are currently encoded.
NFC
Reviewers: pcc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32262
llvm-svn: 301245
Summary:
Ensure that the new merge BB (which contains the rest of the original BB
after the mem op being optimized) gets a profile frequency, in case
there are additional mem ops later in the BB. Otherwise they get skipped
as the merge BB looks cold.
Reviewers: davidxl, xur
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32447
llvm-svn: 301244
This reverts commit r300732. This breaks a few tests.
I think the problem is related to adding more uses of
the condition that don't yet exist at this point.
llvm-svn: 301242
The current Loop Unroll implementation works with loops having a
single latch that contains a conditional branch to a block outside
the loop (the other successor is, by defition of latch, the header).
If this precondition doesn't hold, avoid unrolling the loop as
the code is not ready to handle such circumstances.
Differential Revision: https://reviews.llvm.org/D32261
llvm-svn: 301239
Use constant references rather than `const auto` which will cause the
copy constructor. These particular cases cause issues for the swift
compiler.
llvm-svn: 301237
libraries are properly unloaded when llvm_shutdown is called.
Summary:
This was mostly affecting usage of the JIT, where storing the library handles in
a set made iteration unordered/undefined. This lead to disagreement between the
JIT and native code as to what the address and implementation of particularly on
Windows with stdlib functions:
JIT: putenv_s("TEST", "VALUE") // called msvcrt.dll, putenv_s
JIT: getenv("TEST") -> "VALUE" // called msvcrt.dll, getenv
Native: getenv("TEST") -> NULL // called ucrt.dll, getenv
Also fixed is the issue of DynamicLibrary::getPermanentLibrary(0,0) on Windows
not giving priority to the process' symbols as it did on Unix.
Reviewers: chapuni, v.g.vassilev, lhames
Reviewed By: lhames
Subscribers: danalbert, srhines, mgorny, vsk, llvm-commits
Differential Revision: https://reviews.llvm.org/D30107
llvm-svn: 301236
In call sequence setups, there may not be a frame index base
and the pointer is a constant offset from the frame
pointer / scratch wave offset register.
llvm-svn: 301230
Merges equivalent initializations of M0 and hoists them into a common
dominator block. Technically the same code can be used with any
register, physical or virtual.
Differential Revision: https://reviews.llvm.org/D32279
llvm-svn: 301228
Summary:
llvm.invariant.group.barrier returns pointer that mustalias
pointer it takes. It can't be marked with `returned` attribute,
because it would be remove easily. The other reason is that
only Alias Analysis can know about this, because if any other
pass would know it, then the result would be replaced with it's
argument, which would be invalid.
We can think about returned pointer as something that mustalias, but
it doesn't have to be bitwise the same as the argument.
Reviewers: dberlin, chandlerc, hfinkel, sanjoy
Subscribers: reames, nlewycky, rsmith, anna, amharc
Differential Revision: https://reviews.llvm.org/D31585
llvm-svn: 301227
1. RegisterClass::getSize() is split into two functions:
- TargetRegisterInfo::getRegSizeInBits(const TargetRegisterClass &RC) const;
- TargetRegisterInfo::getSpillSize(const TargetRegisterClass &RC) const;
2. RegisterClass::getAlignment() is replaced by:
- TargetRegisterInfo::getSpillAlignment(const TargetRegisterClass &RC) const;
This will allow making those values depend on subtarget features in the
future.
Differential Revision: https://reviews.llvm.org/D31783
llvm-svn: 301221
Summary: In rL297945, jhenderson added methods for setting permissions
to sys::fs, but some of the unittests that attempt to set sticky bits
(01000) on files fail on modern BSDs, such as FreeBSD, NetBSD and
OpenBSD. This is because those systems do not allow regular users to
set sticky bits on files, only on directories. Fix it by disabling
these particular tests on modern BSDs.
Reviewers: emaste, brad, jhenderson
Reviewed By: jhenderson
Subscribers: joerg, krytarowski, llvm-commits
Differential Revision: https://reviews.llvm.org/D32120
llvm-svn: 301220
When functions are terminated by unreachable instructions, the last
instruction might trigger a CFI instruction to be generated. However,
emitting it would be be illegal since the function (and thus the FDE
the CFI is in) has already ended with the previous instruction.
Darwin's dwarfdump --verify --eh-frame complains about this and the
specification supports this.
Relevant bits from the DWARF 5 standard (6.4 Call Frame Information):
"[The] address_range [field in an FDE]: The number of bytes of
program instructions described by this entry."
"Row creation instructions: [...]
The new location value is always greater than the current one."
The first quotation implies that a CFI cannot describe a target
address outside of the enclosing FDE's range.
rdar://problem/26244988
Differential Revision: https://reviews.llvm.org/D32246
llvm-svn: 301219
Currently the operand type for ATOMIC_FENCE assumes value type of a pointer in address space 0.
This is fine for most targets. However for amdgcn target, the size of pointer in address space 0
depends on triple environment. For amdgiz environment, it is 64 bit but for other environment it is
32 bit. On the other hand, amdgcn target expects 32 bit fence operands independent of the target
triple environment. Therefore a hook is need in target lowering for getting the fence operand type.
This patch has no effect on targets other than amdgcn.
Differential Revision: https://reviews.llvm.org/D32186
llvm-svn: 301215
This is a straight cut and paste, but there's a bigger problem: if this
fold exists for simplifyOr, there should be a DeMorganized version for
simplifyAnd. But more than that, we have a patchwork of ad hoc logic
optimizations in InstCombine. There should be some structure to ensure
that we're not missing sibling folds across and/or/xor.
llvm-svn: 301213
Re-Commit of r300922 and r300923 with less aggressive assert (see
discussion at the end of https://reviews.llvm.org/D32205)
X86RegisterInfo::eliminateFrameIndex() and
X86FrameLowering::getFrameIndexReference() both had logic to compute the
base register. This consolidates the code.
Also use MachineInstr::isReturn instead of manually enumerating tail
call instructions (return instructions were not included in the previous
list because they never reference frame indexes).
Differential Revision: https://reviews.llvm.org/D32206
llvm-svn: 301211
When the location description of a source variable involves arithmetic
on the value itself, it needs to be marked with DW_OP_stack_value since it
is not describing the variable's location, but rather its value.
This is a follow-up to r297971 and fixes the source testcase quoted in
the comment in debuginfo-dce.ll.
rdar://problem/30725338
This reapplies r301093 without modifications.
llvm-svn: 301210
Fixes traps in any block besides the entry block,
and fixes depending on a live-in physical register
by using a virtual register copy.
Also happens to stop emitting a nop in the case
debug trap is not supported.
llvm-svn: 301206
The *real* difference between these two was that
a) The "graphical" dumper could recurse, while the text one could
not.
b) The "text" dumper could display nested types and functions,
while the graphical one could not.
Merge these two so that there is only one dumper that can recurse
arbitrarily deep and optionally display nested types or not.
llvm-svn: 301204
This reworks the way virtual bases are handled, and also the way
padding is detected across multiple levels of aggregates, producing
a much more accurate result.
llvm-svn: 301203
This replaces a hand written copy loop with a call to memcpy for both zext and sext.
For sext, it replaces multiple if/else blocks propagating sign information forward. Now we just do a copy, a sign extension on the last copied word, a memset, and clearUnusedBits.
Differential Revision: https://reviews.llvm.org/D32417
llvm-svn: 301201
There is logic to track the expected number of instructions
produced. It thought in this case an instruction would
be necessary to negate the result, but here it folded
into a ConstantExpr fneg when the non-undef value operand
was cancelled out by the second fsub.
I'm not sure why we don't fold constant FP ops with undef currently,
but I think that would also avoid this problem.
llvm-svn: 301199
This patch adds an in place version of ashr to match lshr and shl which were recently added.
I've tried to make this similar to the lshr code with additions to handle the sign extension. I've also tried to do this with less if checks than the current ashr code by sign extending the original result to a word boundary before doing any of the shifting. This removes a lot of the complexity of determining where to fill in sign bits after the shifting.
Differential Revision: https://reviews.llvm.org/D32415
llvm-svn: 301198
Summary:
Fix a compiler bug when the lane select happens to end up in a VGPR.
Clarify the semantic of the corresponding intrinsic to be that of
the corresponding GLSL: the lane select must be uniform across a
wave front, otherwise results are undefined.
Reviewers: arsenm
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D32343
llvm-svn: 301197
Summary:
Instead of keeping a variable indicating whether there are early exits
in the loop. We keep all the early exits. This improves LICM's ability to
move instructions out of the loop based on is-guaranteed-to-execute.
I am going to update compilation time as well soon.
Reviewers: hfinkel, sanjoy, efriedma, mkuper
Reviewed By: hfinkel
Subscribers: llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D32433
llvm-svn: 301196
Summary:
The return value of these intrinsics should always have 0 bits for
inactive threads. This means that when all arguments are constant
and the comparison evaluates to true, the intrinsic should return
the current exec mask.
Fixes some GL_ARB_shader_ballot tests.
Reviewers: arsenm
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D32344
llvm-svn: 301195
While we use BaseIndexOffset in FindBetterNeighborChains to
appropriately realize they're almost the same address and should be
improved concurrently we do not use it in isAlias using the non-index
understanding FindBaseOffset instead. Adding a BaseIndexOffset check
in isAlias like should allow indexed stores to be merged.
FindBaseOffset to be excised in subsequent patch.
Reviewers: jyknight, aditya_nandakumar, bogner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31987
llvm-svn: 301187
Summary: Region objects capture the address of the creating RegionInfo instance. Because the RegionInfo class is movable, moving a RegionInfo object creates dangling references. This patch fixes these references by walking the Regions post-move, and updating references to the new parent.
Reviewers: Meinersbur, grosser
Reviewed By: Meinersbur, grosser
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31719
llvm-svn: 301175
Summary: SUSE's ARM triples end with -gnueabi even though they are hard-float. This requires special handling of SUSE ARM triples. Hence we need a way to differentiate the SUSE as vendor. This CL adds that.
Reviewers: chandlerc, compnerd, echristo, rengolin
Reviewed By: rengolin
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D32426
llvm-svn: 301174
I found this when investigated "Bug 32319 - .gdb_index is broken/incomplete" for LLD.
When we have object file with .debug_ranges section it may be filled with zeroes.
Relocations are exist in file to relocate this zeroes into real values later, but until that
a pair of zeroes is treated as terminator. And DWARF parser thinks there is no ranges at all
when I am trying to collect address ranges for building .gdb_index.
Solution implemented in this patch is to take relocations in account when parsing ranges.
Differential revision: https://reviews.llvm.org/D32228
llvm-svn: 301170
We have to widen the operands to 32 bits and then we can either use
hardware division if it is available or lower to a libcall otherwise.
At the moment it is not enough to set the Legalizer action to
WidenScalar, since for libcalls it won't know what to do (it won't be
able to find what size to widen to, because it will find Libcall and not
Legal for 32 bits). To hack around this limitation, we request Custom
lowering, and as part of that we widen first and then we run another
legalizeInstrStep on the widened DIV.
llvm-svn: 301166
Instruction isb takes as an operand either 'sy' or an immediate value. This
improves the diagnostic when the string is not 'sy' and adds a test case for
this which was missing. This also adds tests to check invalid inputs for dsb
and dmb.
Differential Revision: https://reviews.llvm.org/D32227
llvm-svn: 301165
Add support for both targets with hardware division and without. For
hardware division we have to add support throughout the pipeline
(legalizer, reg bank select, instruction select). For targets without
hardware division, we only need to mark it as a libcall.
llvm-svn: 301164
Treat them the same as the other binary operations that we have so far,
but on integers rather than floating point types. Extract the common
code into a helper.
This will be used in the ARM backend.
llvm-svn: 301163
When selecting a G_CONSTANT to a MOVi, we need the value to be an Imm
operand. We used to just leave the G_CONSTANT operand unchanged, which
works in some cases (such as the GEP offsets that we create when
referring to stack slots). However, in many other places the G_CONSTANTs
are created with CImm operands. This patch makes sure to handle those as
well, and to error out gracefully if in the end we don't end up with an
Imm operand.
Thanks to Oliver Stannard for reporting this issue.
llvm-svn: 301162
Summary:
This is a tool for comparing the function graphs produced by the
llvm-xray graph too. It takes the form of a new subcommand of the
llvm-xray tool 'graph-diff'.
This initial version of the patch is very rough, but it is close to
feature complete.
Depends on D29363
Reviewers: dblaikie, dberris
Reviewed By: dberris
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D29320
llvm-svn: 301160
Previously single word would always return 0 regardless of the original sign. Multi word would return all 0s or all 1s based on the original sign. Now single word takes into account the sign as well.
llvm-svn: 301159
The changes are causing the i686-mingw32 build to fail.
This reverts commit r301153, and the changes for a separate warning on i686-mingw32 in r301155 and r301156.
llvm-svn: 301157
libraries are properly unloaded when llvm_shutdown is called.
Summary:
This was mostly affecting usage of the JIT, where storing the library handles in
a set made iteration unordered/undefined. This lead to disagreement between the
JIT and native code as to what the address and implementation of particularly on
Windows with stdlib functions:
JIT: putenv_s("TEST", "VALUE") // called msvcrt.dll, putenv_s
JIT: getenv("TEST") -> "VALUE" // called msvcrt.dll, getenv
Native: getenv("TEST") -> NULL // called ucrt.dll, getenv
Also fixed is the issue of DynamicLibrary::getPermanentLibrary(0,0) on Windows
not giving priority to the process' symbols as it did on Unix.
Reviewers: chapuni, v.g.vassilev, lhames
Reviewed By: lhames
Subscribers: danalbert, srhines, mgorny, vsk, llvm-commits
Differential Revision: https://reviews.llvm.org/D30107
llvm-svn: 301153
This change reboots SCEV's current (off by default) verification logic
to avoid false failures. Instead of stringifying trip counts, it maps
old and new trip counts to the same ScalarEvolution "universe" and
asks ScalarEvolution to compute the difference between them. If the
difference comes out to be a non-zero constant, then (barring some
corner cases) we *know* we messed up.
I've not yet enabled this by default since it hits an exponential time
issue in SCEV, but once I fix that, I'll flip it on by default in
EXPENSIVE_CHECKS builds.
llvm-svn: 301146
We handled all of the commuted variants for plain xor already,
although they were scattered around and sometimes folded less
efficiently using distributive laws. We had no folds for not-xor.
Handling all of these patterns consistently is part of trying to
reinstate:
https://reviews.llvm.org/rL300977
llvm-svn: 301144
Summary:
In case all predecessor go to a single successor of current BB. We want to fold (not thread).
I failed to update the phi nodes properly in the last patch https://reviews.llvm.org/rL300657.
Phi nodes values are per predecessor in LLVM.
Reviewers: sanjoy
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32400
llvm-svn: 301139
This makes the WordBits calculation calculate a value between 1 and 64 for the number of bits in the last word. Previously if the BitWidth was a multiple of 64 bits the WordBits value was 0 and we had to bail out early to avoid an undefined shift. Now with a value of 64 we no longer have an undefined shift issue.
This shows a 15-16k reduction in the size of the opt binary on my local x86-64 build.
llvm-svn: 301134
The current code is trying to be clever with shifts to avoid needing to clear unused bits. But it looks like the compiler is unable to optimize out the unused bit handling in the APInt constructor. Given this its better to just use SignExtend64 and have more readable code.
llvm-svn: 301133