The SIInsertWaits pass was overwriting the first operand (gds bit) of
DS_WRITE_B32 with the second operand (value to write). This meant that
any time the value to write was stored in an odd number VGPR, the gds
bit would be set causing the instruction to write to GDS instead of LDS.
llvm-svn: 188522
- Instead of setting the suffixes in a bunch of places, just set one master
list in the top-level config. We now only modify the suffix list in a few
suites that have one particular unique suffix (.ml, .mc, .yaml, .td, .py).
- Aside from removing the need for a bunch of lit.local.cfg files, this enables
4 tests that were inadvertently being skipped (one in
Transforms/BranchFolding, a .s file each in DebugInfo/AArch64 and
CodeGen/PowerPC, and one in CodeGen/SI which is now failing and has been
XFAILED).
- This commit also fixes a bunch of config files to use config.root instead of
older copy-pasted code.
llvm-svn: 188513
When both constants are positive or both constants are negative,
InstCombine already simplifies comparisons like this, but when
it's exactly zero and -1, the operand sorting ends up reversed
and the pattern fails to match. Handle that special case.
Follow up for rdar://14689217
llvm-svn: 188512
This tweaks the CMake rules for building an installation package on Windows:
- Sets license file (otherwise nsis shows an ugly default)
- Adds LLVM logo
- Shows "do you want to add this to the system path" dialog.
Differential Revision: http://llvm-reviews.chandlerc.com/D1414
llvm-svn: 188509
same way as X86_64_GOT relocations. The 'Load' part of GOTLoad is just an
optimization hint for the linker anyway, and can be safely ignored.
This patch also fixes some minor issues with the relocations introduced while
processing an X86_64_GOT[Load]: the addend for the GOT entry should always be
zero, and the addend for the replacement relocation at the original offset
should be the same as the addend of the relocation being replaced.
I haven't come up with a good way of testing this yet, but I'm working on it.
This fixes <rdar://problem/14651564>.
llvm-svn: 188499
Before this patch this flag is IOS specific, but is also
useful for bare project like bootloaders / kernels etc,
since movw / movt prevents simple relocation. Therefore
make this flag more commonly available.
note: this patch depends on a similiar rename in clang
Patch by Jeroen Hofstee.
llvm-svn: 188487
r9 is defined as a platform-specific register in the ARM EABI.
It can be reserved for a special purpose or be used as a general
purpose register. Add support for reserving r9 for all ARM, while
leaving the IOS usage unchanged.
Patch by Jeroen Hofstee.
llvm-svn: 188485
Summary:
When the -dfsan-debug-nonzero-labels parameter is supplied, the code
is instrumented such that when a call parameter, return value or load
produces a nonzero label, the function __dfsan_nonzero_label is called.
The idea is that a debugger breakpoint can be set on this function
in a nominally label-free program to help identify any bugs in the
instrumentation pass causing labels to be introduced.
Reviewers: eugenis
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1405
llvm-svn: 188472
1. The offset range for Thumb1 PC relative loads is [0..1020] and not [-1024..1020]
2. Thumb2 PC relative loads may define the PC, so the restriction placed on target register is removed
3. Removes unneeded alias between "ldr.n" and t1LDRpci. ".n" is actually stripped by both tablegen
and the ASM parser, so this alias rule really does nothing
llvm-svn: 188466
We were marking both LLVMBUILDOUTPUT and LLVMBUILDERRORS as
ERROR_VARIABLES when clearly LLVMBUILDOUTPUT should be marked as
OUTPUT_VARIABLE.
llvm-svn: 188444
When new virtual registers are created during splitting/spilling, defer
creation of the live interval until we need to use the live interval.
Along with the recent commits to notify LiveRangeEdit when new virtual
registers are created, this makes it possible for functions like
TargetInstrInfo::loadRegFromStackSlot() and
TargetInstrInfo::storeRegToStackSlot() to create multiple virtual
registers as part of the process of generating loads/stores for
different register classes, and then have the live intervals for those
new registers computed when they are needed.
llvm-svn: 188437
MachineInstrSpan is initialized with a MachineBasicBlock::iterator,
and is intended to track which instructions are inserted before/after
that instruction from the time the MachineInstrSpan is created.
It provides a begin()/end() interface to walk the range of
instructions inserted around the initial instruction (including that
initial instruction).
It also provides a getInitial() interface to return the initial
iterator.
llvm-svn: 188436
Add a delegate class to MachineRegisterInfo with a single virtual
function, MRI_NoteNewVirtualRegister(). Update LiveRangeEdit to inherit
from this delegate class and override the definition of the callback
with an implementation that tracks the newly created virtual registers.
llvm-svn: 188435
Track new virtual registers by register number, rather than by the live
interval created for them. This is the first step in separating the
creation of new virtual registers and new live intervals. Eventually
live intervals will be created and populated on demand after the virtual
registers have been created and used in instructions.
llvm-svn: 188434
Now that compute support is better on SI, we can't continue using v16i8
for descriptors since this is also a legal type in OpenCL.
This patch fixes numerous hangs with the piglit OpenCL test and since
we now use a target specific DAG node for LOAD_CONSTANT with the
correct MemOperandFlags, this should also fix:
https://bugs.freedesktop.org/show_bug.cgi?id=66805
llvm-svn: 188429
Using REG_SEQUENCE for BUILD_VECTOR rather than a series of INSERT_SUBREG
instructions should make it easier for the register allocator to coalasce
unnecessary copies.
v2:
- Use an SGPR register class if all the operands of BUILD_VECTOR are
SGPRs.
llvm-svn: 188427
The instruction selector will now try to infer the destination register
so it can decided whether to use V_MOV_B32 or S_MOV_B32 when copying
immediates.
llvm-svn: 188426
The previous code declared the operand as unknown:$vaddr, which made
it possible for scalar registers to be used instead of vector registers.
llvm-svn: 188425
This is a follow-up to r187693, correcting that code to request the correct
register class. The previous version, with the wrong register class, was not
really correcting the constraints, but rather was removing them. Coincidentally,
this fixed the failing test case in r187693, but obviously created other
problems.
llvm-svn: 188407
This replaces the old incomplete greylist functionality with an ABI
list, which can provide more detailed information about the ABI and
semantics of specific functions. The pass treats every function in
the "uninstrumented" category in the ABI list file as conforming to
the "native" (i.e. unsanitized) ABI. Unless the ABI list contains
additional categories for those functions, a call to one of those
functions will produce a warning message, as the labelling behaviour
of the function is unknown. The other supported categories are
"functional", "discard" and "custom".
- "discard" -- This function does not write to (user-accessible) memory,
and its return value is unlabelled.
- "functional" -- This function does not write to (user-accessible)
memory, and the label of its return value is the union of the label of
its arguments.
- "custom" -- Instead of calling the function, a custom wrapper __dfsw_F
is called, where F is the name of the function. This function may wrap
the original function or provide its own implementation.
Differential Revision: http://llvm-reviews.chandlerc.com/D1345
llvm-svn: 188402
- For whatever reason, we have a lot of test files with bogus unicode
characters. This patch allows those scripts to still be parsed on Python3 by
changing the parsing logic to work on binary files, and only require the
actual script commands to be convertible to ascii.
- This patch has been tweaked to now ensure that the command strings are not of
unicode type on Python 2.6-7.
llvm-svn: 188398
When determining if two different loads are from the same base address,
this patch allows one load to use a t2LDRi8 address mode and another to
use a t2LDRi12 address mode. The current implementation is very
conservative and this allows the case of differing Thumb2 byte loads to
be considered. Allowing these differing modes instead of forcing the exact
same opcode is useful for situations where one opcodes loads from a base
address+1 and a second opcode loads for a base address-1.
Patch by Daniel Stewart.
llvm-svn: 188385
- For whatever reason, we have a lot of test files with bogus unicode
characters. This patch allows those scripts to still be parsed on Python3 by
changing the parsing logic to work on binary files, and only require the
actual script commands to be convertible to ascii.
llvm-svn: 188376
It's useful to be able to write down floating-point numbers without having to
worry about what they'll be rounded to (as C99 discovered), this extends that
ability to the MC assembly parsers.
llvm-svn: 188370
extremely subtle miscompilations (such as a load getting replaced with
the value stored *below* the load within a basic block) related to
promoting an alloca to an SSA value, there is the dim possibility that
you hit this. Please let me know if you won this unfortunate lottery.
The first half of mem2reg's core logic (as it is used both in the
standalone mem2reg pass and in SROA) builds up a mapping from
'Instruction *' to the index of that instruction within its basic block.
This allows quickly establishing which store dominate a particular load
even for large basic blocks. We cache this information throughout the
run of mem2reg over a function in order to amortize the cost of
computing it.
This is not in and of itself a strange pattern in LLVM. However, it
introduces a very important constraint: absolutely no instruction can be
deleted from the program without updating the mapping. Otherwise a newly
allocated instruction might get the same pointer address, and then end
up with a wrong index. Yes, LLVM routinely suffers from a *single
threaded* variant of the ABA problem. Most places in LLVM don't find
avoiding this an imposition because they don't both delete and create
new instructions iteratively, but mem2reg *loves* to do this... All the
time. Fortunately, the mem2reg code was really careful about updating
this cache to handle this eventuallity... except when it comes to the
debug declare intrinsic. Oops. The fix is to invalidate that pointer in
the cache when we delete it, the same as we do when deleting alloca
instructions and other instructions.
I've also caused the same bug in new code while working on a fix to
PR16867, so this seems to be a really unfortunate pattern. Hopefully in
subsequent patches the deletion of dead instructions can be consolidated
sufficiently to make it less likely that we'll see future occurences of
this bug.
Sorry for not having a test case, but I have literally no idea how to
reliably trigger this kind of thing. It may be single-threaded, but it
remains an ABA problem. It would require a really amazing number of
stars to align.
llvm-svn: 188367
Use the pointer size if datalayout is available.
Use i64 if it's not, which is consistent with what other
places do when the pointer size is unknown.
The test doesn't really test this in a useful way
since it will be transformed to that later anyway,
but this now tests it for non-zero arrays and when
datalayout isn't available. The cases in
visitGetElementPtrInst should save an extra re-visit to
the newly created GEP since it won't need to cleanup after
itself.
llvm-svn: 188339
When computing the use set of a store, we need to add the store to the write
set prior to iterating over later instructions. Otherwise, if there is a later
aliasing load of that store, that load will not be tagged as a use, and bad
things will happen.
trackUsesOfI still adds later dependent stores of an instruction to that
instruction's write set, but it never sees the original instruction, and so
when tracking uses of a store, the store must be added to the write set by the
caller.
Fixes PR16834.
llvm-svn: 188329
However, opt -O2 doesn't run mem2reg directly so nobody noticed until r188146
when SROA started sending more things directly down the PromoteMemToReg path.
In order to revert r187191, I also revert dependent revisions r187296, r187322
and r188146. Fixes PR16867. Does not add the testcases from that PR, but both
of them should get added for both mem2reg and sroa when this revert gets
unreverted.
llvm-svn: 188327
Clients of the option parsing library should handle it explicitly
using a KIND_REMAINING_ARGS option.
Clang and lld have been updated in r188316 and r188318, respectively.
Also fix -Wsign-compare warning in the option parsing test.
llvm-svn: 188323
A common idiom is to use zero and all-ones as sentinal values and to
check for both in a single conditional ("x != 0 && x != (unsigned)-1").
That generates code, for i32, like:
testl %edi, %edi
setne %al
cmpl $-1, %edi
setne %cl
andb %al, %cl
With this transform, we generate the simpler:
incl %edi
cmpl $1, %edi
seta %al
Similar improvements for other integer sizes and on other platforms. In
general, combining the two setcc instructions into one is better.
rdar://14689217
llvm-svn: 188315
This adds KIND_REMAINING_ARGS, a class of options that consume
all remaining arguments on the command line.
This will be used to support /link in clang-cl, which is used
to forward all remaining arguments to the linker.
It also allows us to remove the hard-coded handling of "--",
allowing clients (clang and lld) to implement that functionality
themselves with this new option class.
Differential Revision: http://llvm-reviews.chandlerc.com/D1387
llvm-svn: 188314
* msa SubtargetFeature
* registers
* ld.[bhwd], and st.[bhwd] instructions
Does not correctly prohibit use of both 32-bit FPU registers and MSA together.
Patch by Daniel Sanders
llvm-svn: 188313
LowerCallTo returns a pair with the return value of the call as the first
element and the chain associated with the return value as the second element. If
we lower a call that has a void return value, LowerCallTo returns an SDValue
with a NULL SDNode and the chain for the call. Thus makeLibCall by just
returning the first value makes it impossible for you to set up the chain so
that the call is not eliminated as dead code.
I also updated all references to makeLibCall to reflect the new return type.
llvm-svn: 188300
Summary:
We need to do two things:
- Initialize BSSSection in MCObjectFileInfo::InitCOFFMCObjectFileInfo
- Teach TargetLoweringObjectFileCOFF::SelectSectionForGlobal what to do
with it
This fixes PR16861.
Reviewers: rnk
Reviewed By: rnk
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1361
llvm-svn: 188244
CUs.
Currently only hashes the name of CUs and the names of any children,
but it's an obvious first step to show the framework. The testcase
should continue to be correct, however, as it's an empty TU.
llvm-svn: 188243
to find loops if the From and To instructions were in the same block.
Refactor the code a little now that we need to fill to start the CFG-walking
algorithm with more than one starting basic block sometimes.
Special thanks to Andrew Trick for catching an error in my understanding of
natural loops in code review.
llvm-svn: 188236
FileCheck should check to make sure the prefix was found, and not a word
containing it (e.g -check-prefix=BASEREL shouldn't match NOBASEREL).
Patch by Ron Ofir.
llvm-svn: 188221
R600 doesn't need to do any scheduling on the SelectionDAG now that it
has a very good MachineScheduler. Also, using the VLIW SelectionDAG
scheduler was having a major impact on compile times. For example with
the phatk kernel here are the LLVM IR to machine code compile times:
With Sched::VLIW
Total Compile Time: 1.4890 Seconds (User + System)
SelectionDAG Instruction Scheduling: 1.1670 Seconds (User + System)
With Sched::Source
Total Compile Time: 0.3330 Seconds (User + System)
SelectionDAG Instruction Scheduling: 0.0070 Seconds (User + System)
The code ouput was identical with both schedulers. This may not be true
for all programs, but it gives me confidence that there won't be much
reduction, if any, in code quality by using Sched::Source.
llvm-svn: 188215
In order to appease people (in Apple) who accuse me for committing "huge change" (?) without proper review.
Thank Eric for fixing a compile-warning.
llvm-svn: 188204