Previously, the index was constrained to the size of the memory operation for
no apparent reason. This change removes that constraint so that we can form
pre-index instructions with any valid offset.
llvm-svn: 248931
Same strategy as simplifyInstructionsInBlock. ~1/3 less time
on my test suite. This pass doesn't have many in-tree users,
but getting rid of an O(N^2) worst case and making it cleaner
should at least make it a viable alternative to ADCE, since
it's now consistently somewhat faster.
llvm-svn: 248927
We get into this bad state when someone defines a new member function
for a class but forgets to add the declaration to the class body.
Calling the new member function from a member function template of the
class will crash during instantiation.
llvm-svn: 248925
As Richard Barton observed at http://reviews.llvm.org/D12937#inline-107121
TargetParser in LLVM has insufficient support for ARMv6Z and ARMv6ZK.
In particular, there were no tests for TrustZone being supported in these
architectures.
The patch clears a FIXME: left by Saleem Abdulrasool in r201471, and fixes
his test case which hadn't really been testing what it was claiming to test.
Differential Revision: http://reviews.llvm.org/D13236
llvm-svn: 248921
This linker script parser and evaluator is powerful enough to read
Linux's libc.so, which is (despite its name) a linker script that
contains OUTPUT_FORMAT, GROUP and AS_NEEDED directives.
The parser implemented in this patch is a recursive-descendent one.
It does *not* construct an AST but consumes directives in place and
sets the results to Symtab object, like what Driver is doing.
This should be very fast since less objects are allocated, and
this is also more readable.
http://reviews.llvm.org/D13232
llvm-svn: 248918
Usually large blocks are not a problem. But if a large block (> 10k instructions)
contains many (potential) chains of vector instructions, and those chains are
spread over a wide range of instructions, then scheduling becomes a compile time problem.
This change introduces a limit for the accumulate scheduling region size of a block.
For real-world functions this limit will never be exceeded (it's about 10x larger than
the maximum value seen in the test-suite and external test suite).
llvm-svn: 248917
Because we handle more than SCEV does it is not possible to rewrite an
expression on the top-level using the SCEVParameterRewriter only. With
this patch we will do the rewriting on demand only and also
recursively, thus not only on the top-level.
llvm-svn: 248916
This patch teaches InstCombiner how to convert a SSSE3/AVX2 byte shuffle to a
builtin shuffle if the mask is constant.
Converting byte shuffle intrinsic calls to builtin shuffles can help finding
more opportunities for combining shuffles later on in selection dag.
We may end up with byte shuffles with constant masks as the result of inlining.
Differential Revision: http://reviews.llvm.org/D13252
llvm-svn: 248913
When building a plugin against an installed LLVM toolchain using
add_llvm_loadable_module (in the documented manner) doesn't work as nothing sets
the *_OUTPUT_INTDIR variables causing an error when set_output_directory is
called. Making those arguments optional (causing the default output directory
to be used) fixes this.
Differential Revision: http://reviews.llvm.org/D13215
llvm-svn: 248911
- Remove virtual SC_OpenCLWorkGroupLocal storage type specifier
as it conflicts with static local variables now and prevents
diagnosing static local address space variables correctly.
- Allow static local and global variables (OpenCL2.0 s6.8 and s6.5.1).
- Improve diagnostics of allowed ASes for variables in different scopes:
(i) Global or static local variables have to be in global
or constant ASes (OpenCL1.2 s6.5, OpenCL2.0 s6.5.1);
(ii) Non-kernel function variables can't be declared in local
or constant ASes (OpenCL1.1 s6.5.2 and s6.5.3).
http://reviews.llvm.org/D13105
llvm-svn: 248906
FunctionParmPackExpr actually stores an array of ParmVarDecl* (and
accessors return that). But, the FunctionParmPackExpr::Create()
constructor accepted an array of Decl *s instead.
It was easy for this mismatch to occur without any obvious sign of
something wrong, since both the store and the access used independent
'reinterpet_cast<XX>(this+1)' calls.
llvm-svn: 248905
.ARM.exidx/.ARM.extab sections contain unwind information used on ARM
architecture from unwinding from an exception.
Differential revision: http://reviews.llvm.org/D13245
llvm-svn: 248903
Instructions which we can synthesis from a SCEV expression are not
generated directly, but only when they are used as an operand of
another instruction. This avoids generating unnecessary instructions
and works more reliably than first inserting them and then deleting
them later on.
This commit was reverted in r248860 due to a remaining miscompile, where
we forgot to synthesis the operand values that were referenced from scalar
writes. test/Isl/CodeGen/scalar-store-from-same-bb.ll tests that we do this
now correctly.
llvm-svn: 248900
Summary:
As per Duncan's review for D12536, I extracted the sub-byte bit aligned
reading and writing code into lib/Support, and generalized it. Added calls from
BackpatchWord. Also added unittests.
Reviewers: dexonsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13189
llvm-svn: 248897
Applied restrictions from OpenCL v2.0 s6.13.11.8
that mainly disallow operations on atomic types (except for taking their address - &).
The patch is taken from SPIR2.0 provisional branch, contributed by Guy Benyei!
llvm-svn: 248896
This test was timing out because the test inferior was forking a child, which was not terminated
correctly. The test contained provisions to terminate this child, but these were no longer
working. The idea was to wake up upon receiving SIGTERM and then kill the child. However, this
was failing because the test first tried to use SIGHUP, which ended up killing the inferior.
Fix: make sure we catch SIGHUP also.
llvm-svn: 248889
This is the clang commit associated with llvm r248887.
This commit changes the interface of the vld[1234], vld[234]lane, and vst[1234],
vst[234]lane ARM neon intrinsics and associates an address space with the
pointer that these intrinsics take. This changes, e.g.,
<2 x i32> @llvm.arm.neon.vld1.v2i32(i8*, i32)
to
<2 x i32> @llvm.arm.neon.vld1.v2i32.p0i8(i8*, i32)
This change ensures that address spaces are fully taken into account in the ARM
target during lowering of interleaved loads and stores.
Differential Revision: http://reviews.llvm.org/D13127
llvm-svn: 248888
This commit changes the interface of the vld[1234], vld[234]lane, and vst[1234],
vst[234]lane ARM neon intrinsics and associates an address space with the
pointer that these intrinsics take. This changes, e.g.,
<2 x i32> @llvm.arm.neon.vld1.v2i32(i8*, i32)
to
<2 x i32> @llvm.arm.neon.vld1.v2i32.p0i8(i8*, i32)
This change ensures that address spaces are fully taken into account in the ARM
target during lowering of interleaved loads and stores.
Differential Revision: http://reviews.llvm.org/D12985
llvm-svn: 248887
When using LLVMConfig.cmake from an installed toolchain in order to build a
loadable pass using add_llvm_loadable_module LLVM_ENABLE_PLUGINS and
LLVM_PLUGIN_EXT must be set. Also make LLVM_DEFINITIONS be set to what it
actually is.
Differential Revision: http://reviews.llvm.org/D13214
llvm-svn: 248884
Currently most of the test files have a separate dwarf and a separate
dsym test with almost identical content (only the build step is
different). With adding dwo symbol file handling to the test suit it
would increase this to a 3-way duplication. The purpose of this change
is to eliminate this redundancy with generating 2 test case (one dwarf
and one dsym) for each test function specified (dwo handling will be
added at a later commit).
Main design goals:
* There should be no boilerplate code in each test file to support the
multiple debug info in most of the tests (custom scenarios are
acceptable in special cases) so adding a new test case is easier and
we can't miss one of the debug info type.
* In case of a test failure, the debug symbols used during the test run
have to be cleanly visible from the output of dotest.py to make
debugging easier both from build bot logs and from local test runs
* Each test case should have a unique, fully qualified name so we can
run exactly 1 test with "-f <test-case>.<test-function>" syntax
* Test output should be grouped based on test files the same way as it
happens now (displaying dwarf/dsym results separately isn't
preferable)
Proposed solution (main logic in lldbtest.py, rest of them are test
cases fixed up for the new style):
* Have only 1 test fuction in the test files what will run for all
debug info separately and this test function should call just
"self.build(...)" to build an inferior with the right debug info
* When a class is created by python (the class object, not the class
instance), we will generate a new test method for each debug info
format in the test class with the name "<test-function>_<debug-info>"
and remove the original test method. This way unittest2 see multiple
test methods (1 for each debug info, pretty much as of now) and will
handle the test selection and the failure reporting correctly (the
debug info will be visible from the end of the test name)
* Add new annotation @no_debug_info_test to disable the generation of
multiple tests for each debug info format when the test don't have an
inferior
Differential revision: http://reviews.llvm.org/D13028
llvm-svn: 248883
Before we unconditinoally forced all users outside the SCoP to use
the preloaded value. However, if the SCoP is not executed due to the
runtime checks, we need to use the original value because it might not
be invariant in the first place.
llvm-svn: 248881