The debug message was pretty confusing here. It only reported the
situation with memchecks without the result of the dependence analysis.
Now it prints whether the loop is safe from the POV of the dependence
analysis and if yes, whether we need memchecks.
llvm-svn: 231854
Summary: This change leverages the cross-compiling functionality in the build system to build a release tablegen executable for use during the build.
Reviewers: resistor, rnk
Reviewed By: rnk
Subscribers: rnk, joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D7349
llvm-svn: 231842
This adds new node types for each intrinsic.
For instance, for addv, we have AArch64ISD::UADDV, such that:
(v4i32 (uaddv ...))
is the same as
(v4i32 (scalar_to_vector (i32 (int_aarch64_neon_uaddv ...))))
that is,
(v4i32 (INSERT_SUBREG (v4i32 (IMPLICIT_DEF)),
(i32 (int_aarch64_neon_uaddv ...)), ssub)
In a combine, we transform all such across-vector-lanes intrinsics to:
(i32 (extract_vector_elt (uaddv ...), 0))
This has one big advantage: by making the extract_element explicit, we
enable the existing patterns for lane-aware instructions to fire.
This lets us avoid needlessly going through the GPRs. Consider:
uint32x4_t test_mul(uint32x4_t a, uint32x4_t b) {
return vmulq_n_u32(a, vaddvq_u32(b));
}
We now generate:
addv.4s s1, v1
mul.4s v0, v0, v1[0]
instead of the previous:
addv.4s s1, v1
fmov w8, s1
dup.4s v1, w8
mul.4s v0, v1, v0
rdar://20044838
llvm-svn: 231840
Summary: This patch fixes a bug in `readEncodedPointer()` where it would read from memory that was not suitably aligned. This patch fixes it by using memcpy.
Reviewers: danalbert, echristo, compnerd, mclow.lists
Reviewed By: compnerd, mclow.lists
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D8179
llvm-svn: 231839
Most are redundant, and they never seem to fire.
The V128 integer patterns already exist in the INS multiclass.
The duplicates only fire when the vector index type isn't i64,
because they accept "imm" instead of an explicit "i64", as the
instruction definition patterns do.
TLI::getVectorIdxTy is i64 on AArch64, so this should never happen.
Also, one of them had a typo: for i64, INSvi32lane was used.
I noticed because I mistakenly used an explicit i32 as the idx type,
and got ins.s for an i64 vector_insert.
The V64 patterns also don't seem to ever fire, as V64 vector
extract/insert are legalized to V128.
The equivalent float patterns are unique and useful, so keep them.
No functional change intended; none exhibited on the LIT and LNT tests.
llvm-svn: 231838
Follow up from r231505.
Fix the non-determinism by using a MapVector and reintroduce the AArch64
testcase. Defer deleting the got candidates up to the end and remove
them in a bulk, avoiding linear time removal of each element.
Thanks to Renato Golin for trying it out on other platforms.
llvm-svn: 231830
Because the catchable type has a reference to its name, mangle the
location to ensure that two catchable types with different locations are
distinct.
llvm-svn: 231819
This is the final patch that actually introduces the new parameter of
partition mapping to RuntimePointerCheck::needsChecking.
Another API (LAI::getInstructionsForAccess) is also exposed that helps
to map pointers to instructions because ultimately we partition
instructions.
The WIP version of the Loop Distribution pass in D6930 has been adapted
to use all this. See for example, how
InstrPartitionContainer::computePartitionSetForPointers sets up the
partitions using the above API and then calls to LAI::addRuntimeCheck
with the pointer partitions.
llvm-svn: 231818
Now the analysis won't "fail" if the memchecks exceed the threshold. It
is the transform pass' responsibility to perform the check.
This allows the transform pass to further analyze/eliminate the
memchecks. E.g. in Loop distribution we only need to check pointers
that end up in different partitions.
Note that there is a slight change of functionality here. The logic in
analyzeLoop is that if dependence checking fails due to non-constant
distance between the pointers, another attempt is made to prove safety
of the dependences purely using run-time checks.
Before this patch we could fail the loop due to exceeding the memcheck
threshold after the first step, now we only check the threshold in the
client after the full analysis. There is no measurable compile-time
effect but I wanted to record this here.
llvm-svn: 231817
The check for the number of memchecks will be moved to the client of
this analysis. Besides allowing for transform-specific thresholds, this
also lets Loop Distribution post-process the memchecks; Loop
Distribution only needs memchecks between pointers of different
partitions.
The motivation for this first patch is to untangle the CanDoRT check
from the NumComparison check before moving the NumComparison part.
CanDoRT means that we couldn't determine the bounds for the pointer.
Note that NumComparison is set independent of this flag.
llvm-svn: 231816
If anyone is using this for some strange reason,
LLVMInitializeNVPTXAsmPrinter does exactly the same thing and is what
other LLVM tools are calling.
llvm-svn: 231810
After http://reviews.llvm.org/D8133 landed as r231550 process launch on remote platform stopped working.
This adds Debugger::InitializeForLLGS and tracks whether one or both of Initialize and InitializeForLLGS have been called, calling only the corresponding lldb_private::Terminate* methods as necessary. Since lldb_private::Terminate calls lldb_private::TerminateForLLGS, the latter method may be called twice if Initialize was called for both however the terminate methods ensure they are only called once after being initialized.
This still maintains the reduced binary size, though it does now technically link in lldb_private::Terminate on lldb-server even though this should never be called.
This should resolve the issue raised in http://reviews.llvm.org/D8133 where Debugger::Terminate assumed that there were 0 references to debugger and terminated early.
Differential Revision: http://reviews.llvm.org/D8183
llvm-svn: 231808
The dependences are now expose through the new getInterestingDependences
API so we can use that with -analyze too and fix the FIXME.
This lets us remove the test that relied on -debug to check the
dependences.
llvm-svn: 231807
Gather an array of interesting dependences rather than just failing
after the first unsafe one and regarding the loop unsafe. Loop
Distribution needs to be able to collect all dependences in order to
isolate the dependence cycles into their own partition.
Since the dependence checking algorithm is quadratic in terms of
accesses sharing the same underlying pointer, I am applying a cut-off
threshold (MaxInterestingDependence). Exceeding that, the logic reverts
back to the original approach deeming the loop unsafe upon encountering
the first unsafe dependence.
The main idea of the patch is to split isDepedent from directly
answering the question whether the dep is safe for vectorization to
return a dependence type which then gets mapped to old boolean result
using Dependence::isSafeForVectorization.
Tested that this was compile-time neutral on SpecINT2006 LTO bitcode
inputs. No assembly change on the testsuite including external.
llvm-svn: 231806
LoopDistribution needs to query various results of the dependence
analysis. This series will expose some more APIs and state of the
dependence checker.
This patch is a simple one to just expose the DepChecker instance. The
set is compile-time neutral measured with LTO bitcode files of
SpecINT2006. Also there is no assembly change on the testsuite.
llvm-svn: 231805
Summary:
Fix a comment for ValueObject::GetValueDidChange after r231526.
This fix was requested by @jingham.
Reviewers: jingham, ki.stfu
Reviewed By: ki.stfu
Subscribers: lldb-commits, jingham
Differential Revision: http://reviews.llvm.org/D8206
llvm-svn: 231804
We do not implicitly create an OpenCLImageAccessAttr, so this change only affects out of tree users. There is no way to test this behavior specifically that I can see, since this only affects implicit creation of attributes.
Fixes PR22403.
llvm-svn: 231803
This makes code that uses section relative expressions (debug info) simpler and
less brittle.
This is still a bit awkward as the symbol is created late and has to be
stored in a mutable field.
I will move the symbol creation earlier in the next patch.
llvm-svn: 231802