As noted in the test comment, instcombine now produces the masked
shift value even when it's not included in the source, so we should
handle this.
Although the AMD/Intel docs don't say it explicitly, over-rotating
the narrow ops produces the same results. An existence proof that
this works as expected on all x86 comes from gcc 4.9 or later:
https://godbolt.org/g/K6rc1A
llvm-svn: 310770
Summary: This is to support Android where libc++abi is part of libc++.
Reviewers: srhines, EricWF
Subscribers: dberris, mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D36640
llvm-svn: 310769
Summary:
The stack alignment depends on the ABI (16 bytes for N32 and N64 and 8
bytes for O32), not the CPU type.
Reviewers: sdardis
Reviewed By: sdardis
Subscribers: atanasyan, arichardson, llvm-commits
Differential Revision: https://reviews.llvm.org/D36326
llvm-svn: 310768
Updating remark API to newer OptimizationDiagnosticInfo API. This
allows remarks to show up in diagnostic yaml file, and enables use
of opt-viewer tool.
Hotness information for remarks (L505 and L751) do not display hotness
information, most likely due to profile information not being
propagated yet. Unsure if this is the desired outcome.
Patch by Tarun Rajendran.
Differential Revision: https://reviews.llvm.org/D36127
llvm-svn: 310763
Summary:
Previously we would use these instructions if sse was disabled and fastmath was enabled.
As mentioned in D28335, this is a bad idea.
Reviewers: efriedma, scanon, DavidKreitzer
Reviewed By: DavidKreitzer
Subscribers: zvi, llvm-commits
Differential Revision: https://reviews.llvm.org/D36344
llvm-svn: 310762
This improves readability and (theoretically) improves portability,
as _Ugly names are reserved.
This performs additional de-uglification, so all of these tests
follow the example of iterator.traits/empty.pass.cpp.
llvm-svn: 310761
When the access to a weak symbol is not a call, the access has to be
able to produce the value 0 at runtime.
We were sometimes producing code sequences where that was not possible
if the code was leaded more than 4g away from 0.
llvm-svn: 310756
The linker module contains a symbol of type S_COMPILE3 which
contains various information about the compiler and linker used
to create the PDB, such as the name of the linker, the target
machine, and the linker version. Interestingly, if we set the
version string to 0.0.0.0, then when trying to view local
variables WinDbg emits an error that private symbols are not
present. By setting this to a valid MSVC linker version string,
local variables can display.
As such, even though it is not representative of LLVM's version
information, we need this for compatibility.
llvm-svn: 310755
PDBs need to contain 1 module for each object file/compiland,
and a special one synthesized by the linker. This one contains
a symbol record for each output section in the executable with
its address information. This patch adds such symbols to the
linker module. Note that we also are supposed to add an
S_COFFGROUP symbol for what appears to be each input section that
contributes to each output section, but it's not entirely clear
how to generate these yet, so I'm leaving that for a separate
patch.
llvm-svn: 310754
Two of the Windows bots are failing test\CodeGen\X86\GlobalISel\select-inc.mir
which should not have been affected by the change. Reverting while I investigate.
Also reverted r310735 because it builds on r310716.
llvm-svn: 310745
Previously we were writing an empty globals stream. Windows
tools interpret this as "private symbols are not present in
this PDB", even when they are, so we need to fix this. Regardless,
without it we don't have information about global variables, so
we need to fix it anyway. This patch does that.
With this patch, the "lm" command in WinDbg correctly reports
that we have private symbols available, but the "dv" command
still refuses to display local variables.
Differential Revision: https://reviews.llvm.org/D36535
llvm-svn: 310743
Summary:
When using Python 3, `pygments.highlight()` returns a `bytes` object, not
a `str`, causing the call to `str.replace` on the following line to fail
with a runtime exception:
`TypeError: 'str' does not support the buffer interface`. Decode the
bytes into a string in order to fix the exception.
Test Plan:
Run `opt-viewer.py` with Python 3.4, and confirm no runtime error occurs
when calling `str.replace`.
Reviewers: anemet
Reviewed By: anemet
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36624
llvm-svn: 310741
Summary:
Replace a usage of a Python 2-specific `dict.iteritems()` with the
Python 3-compatible definition provided at the top of the same file.
Test Plan:
Run `opt-viewer.py` using Python 3 and confirm it no longer encounters a
runtime error when calling `dict.iteritems()`.
Reviewers: anemet
Reviewed By: anemet
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36623
llvm-svn: 310740
Summary:
In Python 2, `intern()` is a builtin function available to all programs.
In Python 3, it was moved into the `sys` module, available as
`sys.intern`. Import it such that, within `optrecord.py`, `intern()` is
available whether run using Python 2 or 3.
Test Plan:
Run `opt-viewer.py` using Python 3, confirm it no longer
encounters a runtime error when `intern()` is called.
Reviewers: anemet
Reviewed By: anemet
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36622
llvm-svn: 310739
Summary:
Generate the type table from the types used by a target rather than hard-coding
the union of types used by all targets.
Depends on D36084
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: rovka
Subscribers: kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D36085
llvm-svn: 310735
The flag will perform instrumentation necessary to the fuzzing,
but will NOT link libLLVMFuzzer.a library.
Necessary when modifying CFLAGS for projects which may produce
executables as well as a fuzzable target.
Differential Revision: https://reviews.llvm.org/D36600
llvm-svn: 310733
The pass does simplifications of well known AMD library calls.
If given -amdgpu-prelink option it works in a pre-link mode which
allows to reference new library functions which will be linked in
later.
In addition it also used to process traditional AMD option
-fuse-native which allows to replace some of the functions with
their fast native implementations from the library.
The necessary glue to pass the prelink option and translate
-fuse-native is to be added to the driver.
Differential Revision: https://reviews.llvm.org/D36436
llvm-svn: 310731
Summary:
This autoupgrades most of the broadcast intrinsics. They've been unused in clang for some time.
This leaves the 32x2 intrinsics because they are still used in clang.
Reviewers: RKSimon, zvi, igorb
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36606
llvm-svn: 310725
Summary:
This teaches 512-bit shuffles to detect unused halfs in order to reduce shuffle size.
We may need to refine the 512-bit exit point. I couldn't remember if we had good cross lane shuffles for 8/16 bit with AVX-512 or not.
I believe this is step towards being able to handle D36454 without a special case.
From here we need to improve our ability to combine extract_subvector with insert_subvector and other extract_subvectors. And we need to support narrowing binary operations where we don't demand all elements. This may be improvements to DAGCombiner::narrowExtractedVectorBinOp(by recognizing an insert_subvector in addition to concat) or we may need a target specific combiner.
Reviewers: RKSimon, zvi, delena, jbhateja
Reviewed By: RKSimon, jbhateja
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36601
llvm-svn: 310724
Create a separate test file to contain all tests for OpenMP
offloading to GPUs.
Make libdevice checking more robust by accounting for
the case in which no libdevice is found.
This changes are in connrection with diff: D29660
llvm-svn: 310718
The previous rev (r310208) failed to account for overflow when subtracting the
constants to see if they're suitable for shift/lea. This version add a check
for that and more test were added in r310490.
We can convert any select-of-constants to math ops:
http://rise4fun.com/Alive/d7d
For this patch, I'm enhancing an existing x86 transform that uses fake multiplies
(they always become shl/lea) to avoid cmov or branching. The current code misses
cases where we have a negative constant and a positive constant, so this is just
trying to plug that hole.
The DAGCombiner diff prevents us from hitting a terrible inefficiency: we can start
with a select in IR, create a select DAG node, convert it into a sext, convert it
back into a select, and then lower it to sext machine code.
Some notes about the test diffs:
1. 2010-08-04-MaskedSignedCompare.ll - We were creating control flow that didn't exist in the IR.
2. memcmp.ll - Choose -1 or 1 is the case that got me looking at this again. We could avoid the
push/pop in some cases if we used 'movzbl %al' instead of an xor on a different reg? That's a
post-DAG problem though.
3. mul-constant-result.ll - The trade-off between sbb+not vs. setne+neg could be addressed if
that's a regression, but those would always be nearly equivalent.
4. pr22338.ll and sext-i1.ll - These tests have undef operands, so we don't actually care about these diffs.
5. sbb.ll - This shows a win for what is likely a common case: choose -1 or 0.
6. select.ll - There's another borderline case here: cmp+sbb+or vs. test+set+lea? Also, sbb+not vs. setae+neg shows up again.
7. select_const.ll - These are motivating cases for the enhancement; replace cmov with cheaper ops.
Assembly differences between movzbl and xor to avoid a partial reg stall are caused later by the X86 Fixup SetCC pass.
Differential Revision: https://reviews.llvm.org/D35340
llvm-svn: 310717
Summary:
Support the case where an operand of a pattern is also the whole of the
result pattern. In this case the original result and all its uses must be
replaced by the operand. However, register class restrictions can require
a COPY. This patch handles both cases by always emitting the copy and
leaving it for the register allocator to optimize.
Depends on D35833
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Subscribers: javed.absar, kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D36084
llvm-svn: 310716
As clang defaults to -mno-abicalls when using -fno-pic for N64, implicitly use
-mgpopt in that case.
Reviewers: atanasyan
Differential Revision: https://reviews.llvm.org/D36315
llvm-svn: 310714
Post commit review of rL308619 highlighted the need for handling N64
with -fno-pic. Testing reveale a stale assert when generating a GP
relative addressing mode.
This patch removes that assert and adds the necessary patterns for
MIPS64 to perform gp relative addressing with -fno-pic
(and the implicit -mno-abicalls + -mgpopt).
Reviewers: atanasyan, nitesh.jain
Differential Revision: https://reviews.llvm.org/D36472
llvm-svn: 310713
Expose the dependencies of LLVMExecutionEngine library as PUBLIC rather
than PRIVATE when building a shared library. This is necessary because
the library is not contained but exposes API of other LLVM libraries via
its headers.
This causes other libraries to fail to link if the linker verifies for
correctness of -l flags (i.e. fails on indirect dependencies). This e.g.
happens when building LLDB against shared LLVM:
lib64/liblldbExpression.a(IRExecutionUnit.cpp.o):(.data.rel.ro._ZTIN4llvm18MCJITMemoryManagerE[_ZTIN4llvm18MCJITMemoryManagerE]+0x10): undefined reference to `typeinfo for llvm::RuntimeDyld::MemoryManager'
lib64/liblldbExpression.a(IRExecutionUnit.cpp.o):(.data.rel.ro._ZTVN4llvm18MCJITMemoryManagerE[_ZTVN4llvm18MCJITMemoryManagerE]+0x60): undefined reference to `llvm::RuntimeDyld::MemoryManager::anchor()'
lib64/liblldbExpression.a(IRExecutionUnit.cpp.o):(.data.rel.ro._ZTVN12lldb_private15IRExecutionUnit13MemoryManagerE[_ZTVN12lldb_private15IRExecutionUnit13MemoryManagerE]+0x48): undefined reference to `llvm::RTDyldMemoryManager::deregisterEHFrames()'
lib64/liblldbExpression.a(IRExecutionUnit.cpp.o):(.data.rel.ro._ZTVN12lldb_private15IRExecutionUnit13MemoryManagerE[_ZTVN12lldb_private15IRExecutionUnit13MemoryManagerE]+0x60): undefined reference to `llvm::RuntimeDyld::MemoryManager::anchor()'
lib64/liblldbExpression.a(IRExecutionUnit.cpp.o):(.data.rel.ro._ZTVN12lldb_private15IRExecutionUnit13MemoryManagerE[_ZTVN12lldb_private15IRExecutionUnit13MemoryManagerE]+0xd0): undefined reference to `llvm::JITSymbolResolver::anchor()'
collect2: error: ld returned 1 exit status
Declaring the dependencies as PUBLIC guarantees that any package using
the ExecutionEngine library will also get explicit -l flags for
the dependent libraries guaranteeing that the symbols exposed in headers
could be resolved.
Patch originally written by NAKAMURA Takumi.
Differential Revision: https://reviews.llvm.org/D36211
llvm-svn: 310712
Move store merge to happen after intrinsic lowering to allow lowered
stores to be merged.
Some regressions due in MergeConsecutiveStores to missing
insert_subvector that are addressed in follow up patch.
Reviewers: craig.topper, efriedma, RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34559
llvm-svn: 310710