Summary:
For inalloca functions, this is a very common code pattern:
%argpack = type <{ i32, i32, i32 }>
define void @f(%argpack* inalloca %args) {
entry:
%a = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 0
%b = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 1
%c = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 2
tail call void @llvm.dbg.declare(metadata i32* %a, ... "a")
tail call void @llvm.dbg.declare(metadata i32* %c, ... "b")
tail call void @llvm.dbg.declare(metadata i32* %b, ... "c")
Even though these GEPs can be simplified to a constant offset from EBP
or RSP, we don't do that at -O0, and each GEP is computed into a
register. Registers used to compute argument addresses are typically
spilled and clobbered very quickly after the initial computation, so
live debug variable tracking loses information very quickly if we use
DBG_VALUE instructions.
This change moves processing of dbg.declare between argument lowering
and basic block isel, so that we can ask if an argument has a frame
index or not. If the argument lives in a register as is the case for
byval arguments on some targets, then we don't put it in the side table
and during ISel we emit DBG_VALUE instructions.
Reviewers: aprantl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32980
llvm-svn: 302483
As recently discussed on llvm-dev [1], this patch makes it illegal for
two Functions to point to the same DISubprogram and updates
FunctionCloner to also clone the debug info of a function to conform
to the new requirement. To simplify the implementation it also factors
out the creation of inlineAt locations from the Inliner into a
general-purpose utility in DILocation.
[1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html
<rdar://problem/31926379>
Differential Revision: https://reviews.llvm.org/D32975
llvm-svn: 302469
This is another step towards getting rid of dyn_castNotVal,
so we can recommit:
https://reviews.llvm.org/rL300977
As the tests show, we were missing the lshr case for constants
and both ashr/lshr vector splat folds. The ashr case with constant
was being performed inefficiently in 2 steps. It's also possible
there was a latent bug in that case because we can't do that fold
if the constant is positive:
http://rise4fun.com/Alive/Bge
llvm-svn: 302465
Summary:
An llvm.dbg.declare of a static alloca is always added to the
MachineFunction dbg variable map, so these values are entirely
redundant. They survive all the way through codegen to be ignored by
DWARF emission.
Effectively revert r113967
Two bugpoint-reduced test cases from 2012 broke as a result of this
change. Despite my best efforts, I haven't been able to rewrite the test
case using dbg.value. I'm not too concerned about the lost coverage
because these were reduced from the test-suite, which we still run.
Reviewers: aprantl, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32920
llvm-svn: 302461
Previously type visitation was done strictly sequentially, and
TypeIndexes were computed by incrementing the TypeIndex of the
last visited record. This works fine for situations like dumping,
but not when you want to visit types in random order. For example,
in a debug session someone might lookup a symbol by name, find that
it has TypeIndex 10,000 and then want to go straight to TypeIndex
10,000.
In order to make this work, the visitation framework needs a mode
where it can plumb TypeIndices through the callback pipeline. This
patch adds such a mode. In doing so, it is necessary to provide
an alternative implementation of TypeDatabase that supports random
access, so that is done as well.
Nothing actually uses these random access capabilities yet, but
this will be done in subsequent patches.
Differential Revision: https://reviews.llvm.org/D32928
llvm-svn: 302454
This fixes PR32550, in a way that does not imply running the greedy
mode at O0.
The fix consists in checking if a load is used by any floating point
instruction and if yes, we return a default mapping with FPR instead
of GPR.
llvm-svn: 302453
In r292478, we changed the order of the enum that is referenced by
PMI_FirstXXX. This had the side effect of changing the cost of the
mapping of all the loads, instead of just the FPRs ones.
Reinstate the higher cost for all but GPR loads.
Note: This did not have any external visible effects:
- For Fast mode, the cost would have been higher, but we don't care
because we don't try to use alternative mappings.
- For Greedy mode, the higher cost of the GPR loads, would have
triggered the use of the supposedly alternative mapping, that
would be in fact the same GPR mapping but with a lower cost.
llvm-svn: 302452
Statistic compile to always be 0 in release build so this compare would always return false. And in the debug builds Statistic are global variables and remember their values across pass runs. So this compare returns true anytime the pass runs after the first time it modifies something.
This was found after reviewing all usages of comparison operators on a Statistic object. We had some internal code that did a compare with a statistic that caused a mismatch in output between debug and release builds. So we did an audit out of paranoia.
llvm-svn: 302450
Transforms/IndVarSimplify/2011-10-27-lftrnull will fail if this regresses.
Transforms/GVN/PRE/2011-06-01-NonLocalMemdepMiscompile.ll has been changed to still test what it was
trying to test.
llvm-svn: 302446
This patch uses KnownOnes of the input of ctlz/cttz to bound the value that can be returned from these intrinsics. This makes these intrinsics more similar to the handling for ctpop which already uses known bits to produce a similar bound.
Differential Revision: https://reviews.llvm.org/D32521
llvm-svn: 302444
This introduces a new interface for computeKnownBits that returns the KnownBits object instead of requiring it to be pre-constructed and passed in by reference.
This is a much more convenient interface as it doesn't require the caller to figure out the BitWidth to pre-construct the object. It's so convenient that I believe we can use this interface to remove the special ComputeSignBit flavor of computeKnownBits.
As a step towards that idea, this patch replaces all of the internal usages of ComputeSignBit with this new interface. As you can see from the patch there were a couple places where we called ComputeSignBit which really called computeKnownBits, and then called computeKnownBits again directly. I've reduced those places to only making one call to computeKnownBits. I bet there are probably external users that do it too.
A future patch will update the external users and remove the ComputeSignBit interface. I'll also working on moving more locations to the KnownBits returning interface for computeKnownBits.
Differential Revision: https://reviews.llvm.org/D32848
llvm-svn: 302437
Summary:
Minor refactoring of foldIdentityShuffles() which allows the removal of a
ConstantDataVector::get() in SimplifyShuffleVectorInstruction.
Reviewers: spatel
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32955
Conflicts:
lib/Analysis/InstructionSimplify.cpp
llvm-svn: 302433
Currently combineLogicBlendIntoPBLENDV can only match ASHR to detect sign splatting of a bit mask, this patch generalises this to use computeNumSignBits instead.
This is a first step in several things we can do to improve PBLENDV support:
* Better matching of X86ISD::ANDNP patterns.
* Handle floating point cases.
* Better vector and bitcast support in computeNumSignBits.
* Recognise that PBLENDV only uses the sign bit of the mask, we should be able strip away sign splats (ASHR, PCMPGT isNeg tests etc.).
Differential Revision: https://reviews.llvm.org/D32953
llvm-svn: 302424
Summary:
Following up on Sanjay's suggetion in D32955, move this functionality
into ShuffleVectornstruction.
Reviewers: spatel, RKSimon
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32956
llvm-svn: 302420
Multiple ldr pseudoinstructions with the same constant value will
reuse the same constant pool entry. However, if the constant pool
is explicitly flushed with a .ltorg directive, we should not try
to reference constants in the previous pool any longer, since they
may be out of range.
This fixes assembling hand-written assembler source which repeatedly
loads the same constant value, across a binary size larger than the
pc-relative fixup range for ldr instructions (4096 bytes). Such
assembler source already uses explicit .ltorg instructions to emit
constant pools with regular intervals. However if we try to reuse
constants emitted in earlier pools, they end up out of range.
This makes the output of the testcase match what binutils gas does
(prior to this patch, it would fail to assemble).
Differential Revision: https://reviews.llvm.org/D32847
llvm-svn: 302416
This patch propogates the environment variable SYSTEMDRIVE on Windows when
running the unit tests. This prevents the creation of a directory named
"%SystemDrive%" when running the unit tests from FileSystemTest that use the
function llvm::sys::fs::remove_directories which in turn uses SHFileOperationW.
It is within SHFileOperationW that this environment variable may be used and if
undefined causes the creation of a "%SystemDrive%" directory in the current
directory.
Differential Revision: https://reviews.llvm.org/D32910
llvm-svn: 302409
The value of 'i' is always the smaller of DstParts and SrcParts so we can just use that fact to write all the code in terms of SrcParts and DstParts.
llvm-svn: 302408
This patch introduces an LLVM intrinsic and a target opcode for custom event
logging in XRay. Initially, its use case will be to allow users of XRay to log
some type of string ("poor man's printf"). The target opcode compiles to a noop
sled large enough to enable calling through to a runtime-determined relative
function call. At runtime, when X-Ray is enabled, the sled is replaced by
compiler-rt with a trampoline to the logic for creating the custom log entries.
Future patches will implement the compiler-rt parts and clang-side support for
emitting the IR corresponding to this intrinsic.
Reviewers: timshen, dberris
Subscribers: igorb, pelikan, rSerge, timshen, echristo, dberris, llvm-commits
Differential Revision: https://reviews.llvm.org/D27503
llvm-svn: 302405
Summary: Continue making updates to llvm-readobj to display resource sections. This is necessary for testing the up and coming cvtres tool.
Reviewers: zturner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32609
llvm-svn: 302399
Summary:
This reverts commit 56beec1b1cfc6d263e5eddb7efff06117c0724d2.
Revert "Quick fix to D32609, it seems .o files are not transferred in all cases."
This reverts commit 7652eecd29cfdeeab7f76f687586607a99ff4e36.
Revert "Update llvm-readobj -coff-resources to display tree structure."
This reverts commit 422b62c4d302cfc92401418c2acd165056081ed7.
Reviewers: zturner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32958
llvm-svn: 302397
Summary: Continue making updates to llvm-readobj to display resource sections. This is necessary for testing the up and coming cvtres tool.
Reviewers: zturner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32609
llvm-svn: 302386
Previously SimplifyCFG used getSetSize which returns an APInt that is 1 bit wider than the ConstantRange's bit width. In the reasonably common case that the ConstantRange is 64-bits wide, this requires returning a 65-bit APInt. APInt's can only store 64-bits without a memory allocation so this is inefficient.
The new method takes the 8 as an input and tells if the range contains more than that many elements without requiring any wider math.
llvm-svn: 302385
Account for subvector extraction/insertion, helps prevent the vectorizers from selecting 256-bit vectors that will have to be split anyhow on AVX1 targets.
llvm-svn: 302378
Summary:
Re-applying r301766 with a fix to a typo and a regression test.
The log message for r301766 was:
==================================================================================
InstructionSimplify: Canonicalize shuffle operands. NFC-ish.
Summary:
Apply canonicalization rules:
1. Input vectors with no elements selected from can be replaced with undef.
2. If only one input vector is constant it shall be the second one.
This allows constant-folding to cover more ad-hoc simplifications that
were in place and avoid duplication for RHS and LHS checks.
There are more rules we may want to add in the future when we see a
justification. e.g. mask elements that select undef elements can be
replaced with undef.
==================================================================================
Reviewers: spatel, RKSimon
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32863
llvm-svn: 302373
Currently llvm-rtdyld in -check mode will map sections to back-to-back 4k
aligned slabs starting at 0x1000. Automatically remapping sections by default is
helpful because it quickly exposes relocation bugs due to use of local addresses
rather than load addresses (these would silently pass if the load address was
not remapped). These mappings can be explicitly overridden on a per-section
basis using llvm-rtdlyd's -map-section option. This patch extends this scheme to
also preserve any mappings made by RuntimeDyld itself. Preserving RuntimeDyld's
automatic mappings allows us to write test cases to verify that these automatic
mappings have been applied.
This will allow the fix in https://reviews.llvm.org/D32899 to be tested with
llvm-rtdyld -check.
llvm-svn: 302372
Summary: This makes setRange take ConstantRange by rvalue reference since most callers were passing an unnamed temporary ConstantRange. We can then move that ConstantRange into the DenseMap caches. For the callers that weren't passing a temporary, I've added std::move to to the local variable being passed.
Reviewers: sanjoy, mzolotukhin, efriedma
Reviewed By: sanjoy
Subscribers: takuto.ikuta, llvm-commits
Differential Revision: https://reviews.llvm.org/D32943
llvm-svn: 302371
We can simplify (or (icmp X, C1), (icmp X, C2)) to 'true' or one of the icmps in many cases.
I had to check some of these with Alive to prove to myself it's right, but everything seems
to check out. Eg, the deleted code in instcombine was completely ignoring predicates with
mismatched signedness.
This is a follow-up to:
https://reviews.llvm.org/rL301260https://reviews.llvm.org/D32143
llvm-svn: 302370
The variable Proto is moved at the beginning of the codegen() function.
According to the comment above, the pointed object should be used due the
reference P.
Differential Revision: https://reviews.llvm.org/D32939
llvm-svn: 302369
rL294581 broke unnecessary register dependencies on partial v16i8/v8i16 BUILD_VECTORs, but on SSE41 we (currently) use insertion for full BUILD_VECTORs as well. By allowing full insertion to occur on SSE41 targets we can break register dependencies here as well.
llvm-svn: 302355
Summary:
When writing a loop pass I made a mistake and hit the assertion
"Unreachable block in loop". Later, I hit an assertion when I called
`BasicBlock::eraseFromParent()` incorrectly: "Use still stuck around
after Def is destroyed". This latter assertion, however, printed out
exactly which value is being deleted and what uses remain, which helped
me debug the issue.
To help people debugging their loop passes in the future, print out
exactly which basic block is unreachable in a loop.
Reviewers: sanjoy, hfinkel, mehdi_amini
Reviewed By: mehdi_amini
Subscribers: mzolotukhin
Differential Revision: https://reviews.llvm.org/D32878
llvm-svn: 302354
Remove an extra canonicalization step if ISD::ABS is going to be used anyway.
Updated x86 abs combine to check that we are lowering from both canonicalizations.
llvm-svn: 302337
This changes one parameter to be a const APInt& since we only read from it. Use std::move on local APInts once they are no longer needed so we can reuse their allocations. Lastly, use operator+=(uint64_t) instead of adding 1 to an APInt twice creating a new APInt each time.
llvm-svn: 302335
Summary:
ConstantRange contains two APInts which can allocate memory if their width is larger than 64-bits. So we shouldn't copy it when we can avoid it.
This changes LVILatticeVal::getConstantRange() to return its internal ConstantRange by reference. This allows many places that just need a ConstantRange reference to avoid making a copy.
Several places now capture the return value of getConstantRange() by reference so they can call methods on it that don't need a new object.
Lastly it adds std::move in one place to capture to move a local ConstantRange into an LVILatticeVal.
Reviewers: reames, dberlin, sanjoy, anna
Reviewed By: reames
Subscribers: grandinj, llvm-commits
Differential Revision: https://reviews.llvm.org/D32884
llvm-svn: 302331
This is a step toward having statically allocated instruciton mapping.
We are going to tablegen them eventually, so let us reflect that in
the API.
NFC.
llvm-svn: 302316