Loads and stores can only shift the offset register by the size of the value
being loaded, but currently the DAGCombiner will reduce the width of the load
if it's followed by a trunc making it impossible to later combine the shift.
Solve this by implementing shouldReduceLoadWidth for the AArch64 backend and
make it prevent the width reduction if this is what would happen, though do
allow it if reducing the load width will let us eliminate a later sign or zero
extend.
Differential Revision: https://reviews.llvm.org/D44794
llvm-svn: 328321
This was due to a misunderstanding over what llvm calls a micro-op (retirement unit) is actually called a macro-op on the AMD/Jaguar target. Folded loads don't affect num macro ops.
llvm-svn: 328320
Summary:
This change is part of step six in the series of changes to remove the alignment
argument from memcpy/memmove/memset in favour of alignment attributes. At this
point all users of the IRBuilder APIs for creating a memcpy/memmove call given
a single value for alignment have been updated. We want to discourage usage of
these old APIs in favour of the newer ones that allow for separate source and
destination alignments, so this patch deletes the old API.
Specifically, we remove from IRBuilder:
CallInst *CreateMemCpy(Value *Dst, Value *Src, uint64_t Size, unsigned Align,
bool isVolatile = false, MDNode *TBAATag = nullptr,
MDNode *TBAAStructTag = nullptr,
MDNode *ScopeTag = nullptr,
MDNode *NoAliasTag = nullptr)
CallInst *CreateMemCpy(Value *Dst, Value *Src, Value *Size, unsigned Align,
bool isVolatile = false, MDNode *TBAATag = nullptr,
MDNode *TBAAStructTag = nullptr,
MDNode *ScopeTag = nullptr,
MDNode *NoAliasTag = nullptr)
CallInst *CreateMemMove(Value *Dst, Value *Src, uint64_t Size, unsigned Align,
bool isVolatile = false, MDNode *TBAATag = nullptr,
MDNode *ScopeTag = nullptr,
MDNode *NoAliasTag = nullptr)
CallInst *CreateMemMove(Value *Dst, Value *Src, Value *Size, unsigned Align,
bool isVolatile = false, MDNode *TBAATag = nullptr,
MDNode *ScopeTag = nullptr,
MDNode *NoAliasTag = nullptr)
Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278,
rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774,
rL324781, rL324784, rL324955, rL324960, rL325816, rL327398, rL327421, rL328097 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.
Reference
http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.htmlhttp://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
llvm-svn: 328317
When building the SLP tree, we look for reuse among the vectorized tree
entries. However, each gather sequence is represented by a unique tree entry,
even though the sequence may be identical to another one. This means, for
example, that a gather sequence with two uses will be counted twice when
computing the cost of the tree. We should only count the cost of the definition
of a gather sequence rather than its uses. During code generation, the
redundant gather sequences are emitted, but we optimize them away with CSE. So
it looks like this problem just affects the cost model.
Differential Revision: https://reviews.llvm.org/D44742
llvm-svn: 328316
Summary:
This change is part of step six in the series of changes to remove
the alignment argument from memcpy/memmove/memset in favour of
alignment attributes. At this point all uses of
MemIntrinsicInst::[get|set]Alignment() have been updated, so we now
remove these methods entirely to discourage their use.
Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278,
rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774,
rL324781, rL324784, rL324955, rL324960, rL325816, rL327398, rL327421, rL328097 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.
Reference
http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.htmlhttp://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
llvm-svn: 328315
Summary:
Some targets does not support labels inside debug sections, but support
references in form `section+offset`. Patch adds initial support
for this.
Reviewers: echristo, probinson, jlebar
Subscribers: llvm-commits, JDevlieghere
Differential Revision: https://reviews.llvm.org/D43943
llvm-svn: 328314
When targeting execute-only and fp-armv8, float constants in a compare
resulted in instruction selection failures. This is now fixed by using
vmov.f32 where possible, otherwise the floating point constant is
lowered into a integer constant that is moved into a floating point
register.
This patch also restores using fpcmp with immediate 0 under fp-armv8.
Change-Id: Ie87229706f4ed879a0c0cf66631b6047ed6c6443
llvm-svn: 328313
This was being masked because GISel is enabled by default for -O0 and
the abort was disabled. Modified test to explicitly enable abort.
llvm-svn: 328311
For comparisons with parameters, we can use the ParamState lattice
elements which also provide constant range information. This improves
the code for PR33253 further and gets us closer to use
ValueLatticeElement for all values.
Also, as we are using the range information in the solver directly, we
do not need tryToReplaceWithConstantRange afterwards anymore.
Reviewers: dberlin, mssimpso, davide, efriedma
Reviewed By: mssimpso
Differential Revision: https://reviews.llvm.org/D43762
llvm-svn: 328307
By default, the tool always enables the resource pressure view.
This flag lets user specify whether they want to add that view or not.
llvm-svn: 328305
Loop peeling also has an impact on the induction variables, so we should
benefit from induction variable simplification after peeling too.
Reviewers: sanjoy, bogner, mzolotukhin, efriedma
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D43878
llvm-svn: 328301
There's are race between this thread and the destructor of the test ORC
components on the main threads. I saw flaky failures there in about 4%
of the runs of this unit test.
llvm-svn: 328300
This fixes PR36367 which is about segfault when --emit-relocs is
used together with .eh_frame sections which happens because
of reordering of regular and .rel[a] sections.
Path changes loop that iterates over input sections to create
relocation target sections first.
Differential revision: https://reviews.llvm.org/D44679
llvm-svn: 328299
The VMOVMSKBrr was in a separate InstRW with a lower latency, but I assume they should be the same and the higher latency matches Agners table so I'm going with that.
llvm-svn: 328291
The issues was that we were setting hidden visibility if, when
processing a hidden class, we found out that we needed to emit a
reference to a vtable provided by the standard library.
Original message:
Set dso_local on vtables.
llvm-svn: 328288
I'm not sure /why/ this is causing issues for libclang, but it is.
Unbreak the buildbots since it's already consumed an hour of my time.
llvm-svn: 328286
Changes the analyzer to believe that methods annotated with _Nonnull
from system frameworks indeed return non null objects.
Local methods with such annotation are still distrusted.
rdar://24291919
Differential Revision: https://reviews.llvm.org/D44341
llvm-svn: 328282
Current location is very confusing, especially because there is already
WorkList.h, and other code in CoreEngine.cpp is not related to work list
implementation.
Differential Revision: https://reviews.llvm.org/D44759
llvm-svn: 328280
Putting back the code in commit r327189 that was reverted in r322737. The code is being committed in three stages and this one is the last stage: 1) r327455 fp16 feature flags, 2) r327836 pass half type or i16 based on FullFP16, and 3) the code here which the front-end fp16 vector intrinsic for ARM.
Differential revision https://reviews.llvm.org/D43650
llvm-svn: 328277