This patch reduces code size for all AVX targets and increases speed for some chips.
SSE 4.1 introduced the useless (see code comments) 2-register form of BLENDV and
only in the packed float/double flavors.
AVX subsequently made the instruction useful by adding a 4-register operand form.
So we just need to paper over the lack of scalar forms of this instruction, complicate
the code to choose float or double forms, and use blendv on scalars since all FP is in
xmm registers anyway.
This gives us an approximately 50% speed up for a blendv microbenchmark sequence
on SandyBridge and Haswell:
blendv : 29.73 cycles/iter
logic : 43.15 cycles/iter
No new test cases with this patch because:
1. fast-isel-select-sse.ll tests the positive side for regular X86 lowering and fast-isel
2. sse-minmax.ll and fp-select-cmp-and.ll confirm that we're not firing for scalar selects without AVX
3. fp-select-cmp-and.ll and logical-load-fold.ll confirm that we're not firing for scalar selects with constants.
http://llvm.org/bugs/show_bug.cgi?id=22483
Differential Revision: http://reviews.llvm.org/D8063
llvm-svn: 231408
This is what all the targets check for and is consistent with the
initialized value of MissingFeatures, which is sometimes assinged
to ErrorInfo.
llvm-svn: 231397
This commit enables forming vector extloads for ARM.
It only does so for legal types, and when we can't fold the extension
in a wide/long form of the user instruction.
Enabling it for larger types isn't as good an idea on ARM as it is on
X86, because:
- we pretend that extloads are legal, but end up generating vld+vmov
- we have instructions like vld {dN, dM}, which can't be generated
when we "manually expand" extloads to vld+vmov.
For legal types, the combine doesn't fire that often: in the
integration tests only in a big endian testcase, where it removes a
pointless AND.
Related to rdar://19723053
Differential Revision: http://reviews.llvm.org/D7423
llvm-svn: 231396
We maintain a map from symbols to archive files for the archive file
pre-loading. That map is created at the beginning of the resolve()
and is never updated. However, the input file list may be updated by
File::beforeLink(). This is a patch to update the map after beforeLink.
llvm-svn: 231395
Summary:
LLDB driver was simply tacking quotes around the strings in lldb commands, hoping that will work.
This changes it to properly escape quotes and backslashes.
Reviewers: clayborg
Subscribers: lldb-commits
Differential Revision: http://reviews.llvm.org/D8083
llvm-svn: 231394
This will be followed by a change on the clang side to update
the only user of this function with the new version.
Differential Revision: http://reviews.llvm.org/D8074
Reviewed By: Reid Kleckner
llvm-svn: 231392
The first element of STACKFRAME64 is a struct and Clang wants us to put
braces around it's initialization. Instead, drop the zero. The result
should be the same.
llvm-svn: 231387
llvm::sys::PrintBacktrace(FILE*) is supposed to print a backtrace
of the current thread given the current PC. This function was
unimplemented on Windows, and instead the only time we could
print a backtrace was as the result of an exception through
LLVMUnhandledExceptionFilter.
This patch implements backtracing of self by using
RtlCaptureContext to get a CONTEXT for the current thread, and
moving the printing and StackWalk64 code to a common method that
printing own stack trace and printing stack trace of an exception
can use.
Differential Revision: http://reviews.llvm.org/D8068
Reviewed by: Reid Kleckner
llvm-svn: 231382
Currently shuffles may only be combined if they are of the same type, despite the fact that bitcasts are often introduced in between shuffle nodes (e.g. x86 shuffle type widening).
This patch allows a single input shuffle to peek through bitcasts and if the input is another shuffle will merge them, shuffling using the smallest sized type, and re-applying the bitcasts at the inputs and output instead.
Dropped old ShuffleToZext test - this patch removes the use of the zext and vector-zext.ll covers these anyhow.
Differential Revision: http://reviews.llvm.org/D7939
llvm-svn: 231380
We need to reinterpret float/double types as uint/ulong in order to
perform the bitwise operations.
This has been tested with piglit, OpenCV, and the ocl conformance tests.
v2:
- Use vector operations rather than splitting vectors into scalar
components.
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 231373
Added lowering for ISD::CONCAT_VECTORS and ISD::INSERT_SUBVECTOR for i1 vectors,
it is needed to pass all masked_memop.ll tests for SKX.
llvm-svn: 231371
Summary:
Replace unrecognized namespace ending comments. This will help in particular when a namespace ending comment is mistyped or doesn't fit the regexp for other reason, e.g.:
namespace a {
namespace b {
namespace {
} // anoynmous namespace
} // b
} // namesapce a
Reviewers: djasper
Reviewed By: djasper
Subscribers: curdeius, cfe-commits
Differential Revision: http://reviews.llvm.org/D8078
llvm-svn: 231369
Long story short: stop-the-world briefly resets SIGSEGV handler to SIG_DFL.
This breaks programs that handle and continue after SIGSEGV (namely JVM).
See the test and comments for details.
This is reincarnation of reverted r229678 (http://reviews.llvm.org/D7722).
Changed:
- execute TracerThreadDieCallback only on tracer thread
- reset global data in TracerThreadSignalHandler/TracerThreadDieCallback
- handle EINTR from waitpid
Add 3 new test:
- SIGSEGV during leak checking
- StopTheWorld operation during signal storm from an external process
- StopTheWorld operation when the program generates and handles SIGSEGVs
http://reviews.llvm.org/D8032
llvm-svn: 231367
Also it extracts getCopyFromRegs helper function in SelectionDAGBuilder as we need to be able to customize type of the register exported from basic block during lowering of the gc.result.
llvm-svn: 231366
Build time (user time) for building llvm+clang+lldb in release mode:
- default allocator: 9086 seconds
- with PBQP: 9126 seconds
- with PBQP + local bit matrix cache: 9097 seconds
llvm-svn: 231360
isNormalFp and isFiniteNonZeroFp should not assume vector operands can not be constant expressions.
Patch by Pawel Jurek <pawel.jurek@intel.com>
Differential Revision: http://reviews.llvm.org/D8053
llvm-svn: 231359
This should help with the AVX512 masked gather changes Elena is working on. This patch is derived from some of the changes Elena made to tablegen, but modified by me to support arbitrary number of results.
llvm-svn: 231357