If the loaded type sizes don't match the element type of the sequential type, all bets are off and the addresses may, indeed, overlap.
Surprisingly, this just got caught in one test, on one builder, out of the 30+ builders testing this change. Congratulations go to http://lab.llvm.org:8011/builders/clang-aarch64-lnt/builds/5205.
llvm-svn: 251112
Also adds a 'trivial' ELF file. This was generated by assembling
and linking a file with the symbol main which contains a single
return instruction.
llvm-svn: 251096
Summary: Currently SimplifyResume can convert an invoke instruction to a call instruction if its landing pad is trivial. In practice we could have several invoke instructions with trivial landing pads and share a common rethrow block, and in the common rethrow block, all the landing pads join to a phi node. The patch extends SimplifyResume to check the phi of landing pad and their incoming blocks. If any of them is trivial, remove it from the phi node and convert the invoke instruction to a call instruction.
Reviewers: hfinkel, reames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13718
llvm-svn: 251061
The test case wasn't testing what it was commented to be testing; and
when I tried to fix the test I noticed that SCEV does not support the
simplification that the test was supposed to test.
This change removes the test case to avoid confusion.
llvm-svn: 251053
Summary:
An unsigned comparision is equivalent to is corresponding signed version
if both the operands being compared are positive. Teach SCEV to use
this fact when profitable.
Reviewers: atrick, hfinkel, reames, nlewycky
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13687
llvm-svn: 251051
Summary:
- A s< (A + C)<nsw> if C > 0
- A s<= (A + C)<nsw> if C >= 0
- (A + C)<nsw> s< A if C < 0
- (A + C)<nsw> s<= A if C <= 0
Right now `C` needs to be a constant, but we can later generalize it to
be a non-constant if needed.
Reviewers: atrick, hfinkel, reames, nlewycky
Subscribers: sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D13686
llvm-svn: 251050
Summary:
This uses `ScalarEvolution::getRange` and not potentially control
dependent `nsw` and `nuw` bits on the arithmetic instruction.
Reviewers: atrick, hfinkel, nlewycky
Subscribers: llvm-commits, sanjoy
Differential Revision: http://reviews.llvm.org/D13613
llvm-svn: 251048
* Don't instrument promotable dynamic allocas:
We already have a test that checks that promotable dynamic allocas are
ignored, as well as static promotable allocas. Make sure this test will
still pass if/when we enable dynamic alloca instrumentation by default.
* Handle lifetime intrinsics before handling dynamic allocas:
lifetime intrinsics may refer to dynamic allocas, so we need to emit
instrumentation before these dynamic allocas would be replaced.
Differential Revision: http://reviews.llvm.org/D12704
llvm-svn: 251045
When we fold "mul ((add x, c1), c1)" -> "add ((mul x, c2), c1*c2)", we bail if (add x, c1) has multiple
users which would result in an extra add instruction.
In such cases, this patch adds a check to see if we can eliminate a multiply instruction in exchange for the extra add.
I also added the capability of doing the existing optimization with non-splatted vectors (splatted also works).
Differential Revision: http://reviews.llvm.org/D13740
llvm-svn: 251028
PR24686 identifies a problem where a relocation expression is invalid
when not all of the symbols in the expression can be locally
resolved. This causes the compiler to request a PC-relative half16ds
relocation, which is nonsensical for PowerPC. This patch recognizes
this situation and ensures we fail the assembly cleanly.
Test case provided by Anton Blanchard.
llvm-svn: 251027
Summary:
Hyphens were missing from the triple, causing it to be parsed
incorrectly. This patch updates the triple and makes necessary
changes to the expected output.
Patch is from Vinicius Tinti.
Reviewers: ab, tinti
Subscribers: srhines, llvm-commits
Differential Revision: http://reviews.llvm.org/D13792
llvm-svn: 251020
Instead of bailing out when we see loads, analyze them. If we can prove that the loaded-from address must escape, then we can conclude that a load from that address must escape too and therefore cannot alias a non-addr-taken global.
When checking if a Value can alias a non-addr-taken global, if the Value is a LoadInst of a non-global, recurse instead of bailing.
If we can follow a trail of loads up to some base that is captured, we know by inference that all the loads we followed are also captured.
llvm-svn: 251017
If the final indices of two GEPs can be proven to not be equal, and
the GEP is of a SequentialType (not a StructType), then the two GEPs
do not alias.
llvm-svn: 251016
isKnownNonEqual(A, B) returns true if it can be determined that A != B.
At the moment it only knows two facts, that a non-wrapping add of nonzero to a value cannot be that value:
A + B != A [where B != 0, addition is nsw or nuw]
and that contradictory known bits imply two values are not equal.
This patch also hooks this up to InstSimplify; InstSimplify had a peephole for the first fact but not the second so this teaches InstSimplify a new trick too (alas no measured performance impact!)
llvm-svn: 251012
Summary:
If a `CallSite` has operand bundles, then do not peek into the called
function to get a more precise `ModRef` answer.
This is tested using `argmemonly`, `-basicaa` and `-gvn`; but the
functionality is not specific to any of these.
Depends on D13961
Reviewers: reames, chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13962
llvm-svn: 250974
Summary:
This makes attribute accessors on `CallInst` and `InvokeInst` do the
(conservatively) right thing. This essentially involves, in some
cases, *not* falling back querying the attributes on the called
`llvm::Function` when operand bundles are present.
Attributes locally present on the `CallInst` or `InvokeInst` will still
override operand bundle semantics. The LangRef has been amended to
reflect this. Note: this change does not do anything prevent
`-function-attrs` from inferring `CallSite` local attributes after
inspecting the called function -- that will be done as a separate
change.
I've used `-adce` and `-early-cse` to test these changes. There is
nothing special about these passes (and they did not require any
changes) except that they seemed be the easiest way to write the tests.
This change does not add deal with `argmemonly`. That's a later change
because alias analysis requires a related fix before `argmemonly` can be
tested.
Reviewers: reames, chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13961
llvm-svn: 250973
These were the cause of a verifier error when building 7zip with
-verify-machineinstrs. Running 'make check' with the verifier
triggered the same error on the test here so i've updated the test
to run the verifier on one of its runs instead of adding a new one.
While looking at this code, there was a stale comment that these
instructions were only used for disassembly. This probably used to
be the case, but they are now used in the 'ARM load / store optimization pass' too.
This reapplies r242300 which was reverted in r242428 due to bot failures.
Ultimately those failures were spurious and completely unrelated to this commit. I reverted this
at the time because it was thought to be at fault.
llvm-svn: 250969
There may be other use operands that also need their kill flags cleared.
This happens in a few tests when SIFoldOperands is moved after
PeepholeOptimizer.
PeepholeOptimizer rewrites cases that look like:
%vreg0 = ...
%vreg1 = COPY %vreg0
use %vreg1<kill>
%vreg2 = COPY %vreg0
use %vreg2<kill>
to use the earlier source to
%vreg0 = ...
use %vreg0
use %vreg0
Currently SIFoldOperands sees the copied registers, so there is
only one use. So far I haven't managed to come up with a test
that currently has multiple uses of a foldable VGPR -> VGPR copy.
llvm-svn: 250960
Summary: ELF's STT_File symbols may overlap with regular globals in
other files, so we should ignore them here in order to avoid having
bogus entries in the symbol table that confuse us when resolving relocations.
Reviewers: lhames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13888
llvm-svn: 250942
SimplifyTerminatorOnSelect didn't consider the possibility that the
condition might be related to one of PHI nodes.
This fixes PR25267.
llvm-svn: 250922
in the size field in the archive header for the member is not a number. To do this we
have all of the needed methods return ErrorOr to push them up until we get out of lib.
Then the tools and can handle the error in whatever way is appropriate for that tool.
So the solution is to plumb all the ErrorOr stuff through everything that touches archives.
This include its iterators as one can create an Archive object but the first or any other
Child object may fail to be created due to a bad size field in its header.
Thanks to Lang Hames on the changes making child_iterator contain an
ErrorOr<Child> instead of a Child and the needed changes to ErrorOr.h to add
operator overloading for * and -> .
We don’t want to use llvm_unreachable() as it calls abort() and is produces a “crash”
and using report_fatal_error() to move the error checking will cause the program to
stop, neither of which are really correct in library code. There are still some uses of
these that should be cleaned up in this library code for other than the size field.
Also corrected the code where the size gets us to the “at the end of the archive”
which is OK but past the end of the archive will return object_error::parse_failed now.
The test cases use archives with text files so one can see the non-digit character,
in this case a ‘%’, in the size field.
llvm-svn: 250906
Summary:
Previously, we were inserting an InlineAsm statement for each line of the
inline assembly. This works for GAS but it triggers prologue/epilogue
emission when IAS is in use. This caused:
.set noreorder
.cpload $25
to be emitted as:
.set push
.set reorder
.set noreorder
.set pop
.set push
.set reorder
.cpload $25
.set pop
which led to assembler errors and caused the test to fail.
The whitespace-after-comma changes included in this patch are necessary to
match the output when IAS is in use.
Reviewers: vkalintiris
Subscribers: rkotler, llvm-commits, dsanders
Differential Revision: http://reviews.llvm.org/D13653
llvm-svn: 250895
When we have to convert the masked.load, masked.store to scalar code, we generate a chain of conditional basic blocks.
I added optimization for constant mask vector.
Differential Revision: http://reviews.llvm.org/D13855
llvm-svn: 250893
Summary:
The forwards compatibility strategy employed by MIPS is to consider registers
to be infinitely sign-extended. Then on ISA's with a wider register, the result
of existing instructions are sign-extended to register width and zero-extended
counterparts are added. copy_u.w on MSA32 and copy_u.w on MSA64 violate this
strategy and we have therefore corrected the MSA specs to fix this.
We still keep track of sign/zero-extension during legalization but we now
match copy_s.[wd] where required.
No change required to clang since __builtin_msa_copy_u_[wd] will map to
copy_s.[wd] where appropriate for the target.
Reviewers: vkalintiris
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13472
llvm-svn: 250887
A mem-to-mem instruction (that both loads and stores), which store to an
FI, cannot pass the verifier since it thinks it is loading from the FI.
For the mem-to-mem instruction, do a looser check in visitMachineOperand()
and only check liveness at the reg-slot while analyzing a frame index operand.
Needed to make CodeGen/SystemZ/xor-01.ll pass with -verify-machineinstrs,
which now runs with this flag.
Reviewed by Evan Cheng and Quentin Colombet.
llvm-svn: 250885
Do not tail duplicate blocks where the successor has a phi node,
and the corresponding value in that phi node uses a subregister.
http://reviews.llvm.org/D13922
llvm-svn: 250877
C/C++ code can declare an extern function, which will show up as an import in WebAssembly's output. It's expected that the linker will resolve these, and mark unresolved imports as call_import (I have a patch which does this in wasmate).
llvm-svn: 250875
In some cases (as illustrated in the unittest), lineno can be less than the heade_lineno because the function body are included from some other files. In this case, offset will be negative. This patch makes clang still able to match the profile to IR in this situation.
http://reviews.llvm.org/D13914
llvm-svn: 250873
Summary:
TargetLoweringBase::Expand is defined as "Try to expand this to other ops,
otherwise use a libcall." For ISD::UDIV and ISD::SDIV, the choice between
the two possibilities was defined in a rather convoluted way:
- if DIVREM is legal, expand to DIVREM
- if DIVREM has a custom lowering, expand to DIVREM
- if DIVREM libcall is defined and a remainder from the same division is
computed elsewhere, expand to a DIVREM libcall
- else, expand to a DIV libcall
This had the undesirable effect that if both DIV and DIVREM are implemented
as libcalls, then ISD::UDIV and ISD::SDIV are expanded to the heavier DIVREM
libcall, even when the remainder isn't used.
The new code adds a new LegalizeAction, TargetLoweringBase::LibCall, so that
backends can directly control whether they prefer an expansion or a conversion
to a libcall. This makes the generic lowering code even more generic,
allowing its reuse in a wider range of target-specific configurations.
The useful effect is that ARM backend will now generate a call
to __aeabi_{i,u}div rather than __aeabi_{i,u}divmod in cases where
it doesn't need the remainder. There's no functional change outside
the ARM backend.
Reviewers: t.p.northover, rengolin
Subscribers: t.p.northover, llvm-commits, aemerson
Differential Revision: http://reviews.llvm.org/D13862
llvm-svn: 250826
The mask value type for maskload/maskstore GCC builtins is never a vector of
packed floats/doubles.
This patch fixes the following issues:
1. The mask argument for builtin_ia32_maskloadpd and builtin_ia32_maskstorepd
should be of type llvm_v2i64_ty and not llvm_v2f64_ty.
2. The mask argument for builtin_ia32_maskloadpd256 and
builtin_ia32_maskstorepd256 should be of type llvm_v4i64_ty and not
llvm_v4f64_ty.
3. The mask argument for builtin_ia32_maskloadps and builtin_ia32_maskstoreps
should be of type llvm_v4i32_ty and not llvm_v4f32_ty.
4. The mask argument for builtin_ia32_maskloadps256 and
builtin_ia32_maskstoreps256 should be of type llvm_v8i32_ty and not
llvm_v8f32_ty.
Differential Revision: http://reviews.llvm.org/D13776
llvm-svn: 250817
This wasn't doing anything useful. They weren't explicitly used
anywhere, and the RegScavenger ignores reserved registers.
This for some reason caused a random scheduling change in the test.
Getting the check lines to pass is too frustrating, and there's probably
not too much value in checking the vector case's operands N times.
llvm-svn: 250794