Summary:
This test is allowed to run on non-x86 hosts and thus must use
llvm-nm rather than nm.
Differential Revision: https://reviews.llvm.org/D25473
llvm-svn: 283901
The non-obvious motivation for adding this fold (which already happens in InstCombine)
is that we want to canonicalize IR towards select instructions and canonicalize DAG
nodes towards boolean math. So we need to recreate some folds in the DAG to handle that
change in direction.
An interesting implementation difference for cases like this is that InstCombine
generally works top-down while the DAG goes bottom-up. That means we need to detect
different patterns. In this case, the SimplifyDemandedBits fold prevents us from
performing a zext to sext fold that would then be recognized as a negation of a sext.
llvm-svn: 283900
Previously we would print
USAGE: <exe> [subcommand] [options]
Even if no subcommands were present. This changes the output
format to only print "[subcommand]" if there is at least one
subcommand.
Fixes llvm.org/pr30598
Patch by Serge Guelton
llvm-svn: 283892
For each block check that it doesn't have any uses outside of it's innermost loop.
Differential Revision: https://reviews.llvm.org/D25364
llvm-svn: 283877
The high registers are not allocatable in Thumb1 functions, but they
could still be used by inline assembly, so we need to save and restore
the callee-saved high registers (r8-r11) in the prologue and epilogue.
This is complicated by the fact that the Thumb1 push and pop
instructions cannot access these registers. Therefore, we have to move
them down into low registers before pushing, and move them back after
popping into low registers.
In most functions, we will have low registers that are also being
pushed/popped, which we can use as the temporary registers for
saving/restoring the high registers. However, this is not guaranteed, so
we may need to push some extra low registers to ensure that the high
registers can be saved/restored. For correctness, it would be sufficient
to use just one low register, but if we have enough low registers
available then we only need one push/pop instruction, rather than one
per high register.
We can also use the argument/return registers when they are not live,
and the link register when saving (but not restoring), reducing the
number of extra registers we need to push.
There are still a few extreme edge cases where we need two push/pop
instructions, because not enough low registers can be made live in the
prologue or epilogue.
In addition to the regression tests included here, I've also tested this
using a script to generate functions which clobber different
combinations of registers, have different numbers of argument and return
registers (including variadic arguments), allocate different fixed sized
objects on the stack, and do or don't use variable sized allocas and the
__builtin_return_address intrinsic (all of which affect the available
registers in the prologue and epilogue). I ran these functions in a test
harness which verifies that all of the callee-saved registers are
correctly preserved.
Differential Revision: https://reviews.llvm.org/D24228
llvm-svn: 283867
Currently, the Int_eh_sjlj_dispatchsetup intrinsic is marked as
clobbering all registers, including floating-point registers that may
not be present on the target. This is technically true, as we could get
linked against code that does use the FP registers, but that will not
actually work, as the soft-float code cannot save and restore the FP
registers. SjLj exception handling can only work correctly if either all
or none of the code is built for a target with FP registers. Therefore,
we can assume that, when Int_eh_sjlj_dispatchsetup is compiled for a
soft-float target, it is only going to be linked against other
soft-float code, and so only clobbers the general-purpose registers.
This allows us to check that no non-savable registers are clobbered when
generating the prologue/epilogue.
Differential Revision: https://reviews.llvm.org/D25180
llvm-svn: 283866
Allow instructions such as 'cmp w0, #(end - start)' by folding the
expression into a constant. For ELF, we fold only if the symbols are in
the same section. For MachO, we fold if the expression contains only
symbols that are not linker visible.
Fixes https://llvm.org/bugs/show_bug.cgi?id=18920
Differential Revision: https://reviews.llvm.org/D23834
llvm-svn: 283862
Bot does not like it: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/17075
/mnt/b/sanitizer-buildbot3/sanitizer-x86_64-linux-fast/build/llvm/test/Object/invalid.test:70:32: error: expected string not found in input
INVALID-SEC-ADDRESS-ALIGNMENT: Invalid address alignment of section headers
^
<stdin>:1:1: note: scanning from here
/mnt/b/sanitizer-buildbot3/sanitizer-x86_64-linux-fast/build/llvm/include/llvm/Object/ELF.h:412:7: runtime error: upcast of misaligned address 0x000002d8b899 for type 'llvm::object::Elf_Shdr_Impl<llvm::object::ELFType<llvm::support::endianness::little, true> >', which requires 2 byte alignment
^
<stdin>:1:125: note: possible intended match here
/mnt/b/sanitizer-buildbot3/sanitizer-x86_64-linux-fast/build/llvm/include/llvm/Object/ELF.h:412:7: runtime error: upcast of misaligned address 0x000002d8b899 for type 'llvm::object::Elf_Shdr_Impl<llvm::object::ELFType<llvm::support::endianness::little, true> >', which requires 2 byte alignment
llvm-svn: 283858
This reverts commit r283842.
test/CodeGen/X86/tail-dup-repeat.ll causes and llc crash with our
internal testing. I'll share a link with you.
llvm-svn: 283857
LLVM's RandomNumberGenerator wasn't compatible with
the random distribution from <random>.
Fixes PR25105
Patch by: Serge Guelton <serge.guelton@telecom-bretagne.eu>
Differential Revision: https://reviews.llvm.org/D25443
llvm-svn: 283854
Summary: This patch sets function as hot if function's entry count is hot/cold.
Reviewers: eraman, davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D25048
llvm-svn: 283852
This changes MachineRegisterInfo to be initializes after parsing all
instructions. This is in preparation for upcoming commits that allow the
register class specification on the operand or deduce them from the
MCInstrDesc.
This commit removes the unused feature of having nonsequential register
numbers. This was confusing anyway as the vreg numbers would be
different after parsing when you had "holes" in your numbering.
This patch also introduces the concept of an incomplete virtual
register. An incomplete virtual register may be used during .mir parsing
to construct MachineOperands without knowing the exact register class
(or register bank) yet.
NFC except for some error messages.
Differential Revision: https://reviews.llvm.org/D22397
llvm-svn: 283848
The tail duplication pass uses an assumed layout when making duplication
decisions. This is fine, but passes up duplication opportunities that
may arise when blocks are outlined. Because we want the updated CFG to
affect subsequent placement decisions, this change must occur during
placement.
In order to achieve this goal, TailDuplicationPass is split into a
utility class, TailDuplicator, and the pass itself. The pass delegates
nearly everything to the TailDuplicator object, except for looping over
the blocks in a function. This allows the same code to be used for tail
duplication in both places.
This change, in concert with outlining optional branches, allows
triangle shaped code to perform much better, esepecially when the
taken/untaken branches are correlated, as it creates a second spine when
the tests are small enough.
Issue from previous rollback fixed, and a new test was added for that
case as well. Issue was worklist/scheduling/taildup issue in layout.
Issue from 2nd rollback fixed, with 2 additional tests. Issue was
tail merging/loop info/tail-duplication causing issue with loops that share
a header block.
Issue with early tail-duplication of blocks that branch to a fallthrough
predecessor fixed with test case: tail-dup-branch-to-fallthrough.ll
Differential revision: https://reviews.llvm.org/D18226
llvm-svn: 283842
Summary:
Previously, when allocating unspillable live ranges, we would never
attempt to split. We would always bail out and try last ditch graph
recoloring.
This patch changes this by attempting to split all live intervals before
performing recoloring.
This fixes LLVM bug PR14879.
I can't add test cases for any backends other than AVR because none of
them have small enough register classes to trigger the bug.
Reviewers: qcolombet
Subscribers: MatzeB
Differential Revision: https://reviews.llvm.org/D25070
llvm-svn: 283838
When combining an integer load with !range metadata that does not include 0 to a pointer load, make sure emit !nonnull metadata on the newly-created pointer load. This prevents the !nonnull metadata from being dropped during a ptrtoint/inttoptr pair.
This fixes PR30597.
Patch by Ariel Ben-Yehuda!
Differential Revision: https://reviews.llvm.org/D25215
llvm-svn: 283836
This only adds the support for 64-bit vector OR. Adding more sizes is
not difficult, but it requires a bigger refactoring because ORs work on
any size, not necessarly the ones that match the width of the register
width. Right now, this is not expressed in the legalization, so don't
bother pushing the refactoring yet.
llvm-svn: 283831
Previously, there is no way to create a stream other than pre-defined
special stream such as DBI or IPI. This patch adds a new method,
addDbgStream, to add a debug stream to a PDB file.
Differential Revision: https://reviews.llvm.org/D25356
llvm-svn: 283823
Traceback (most recent call last):
File "bin/llvm-lit", line 44, in <module>
lit.main(builtin_parameters)
AttributeError: 'module' object has no attribute 'main'
Suggested by Artem Belevich.
llvm-svn: 283816
This reverts commit r283798, as it causes static asserts on
MSVC 2015 with the following errors:
ArrayRefTest.cpp(38): error C2338: Assigning from single prvalue element
ArrayRefTest.cpp(41): error C2338: Assigning from single xvalue element
ArrayRefTest.cpp(47): error C2338: Assigning from an initializer list
llvm-svn: 283803
llvm::cl already has a function called llvm::apply() so this is
causing an ODR violation. The STLExtras version should win the
vote on which one gets to be called apply() since it is named
after the equivalent STL function, but since renaiming the cl
version is more difficult, let's do this for now to get the
bots green.
llvm-svn: 283800
Without this, the following statements will create ArrayRefs that
refer to temporary storage that goes out of scope by the end of the
line:
someArrayRef = getSingleElement();
someArrayRef = {elem1, elem2};
Note that the constructor still has this problem:
ArrayRef<Element> someArrayRef = getSingleElement();
ArrayRef<Element> someArrayRef = {elem1, elem2};
but that's a little harder to get rid of because we want to be able to
use this in calls:
takesArrayRef(getSingleElement());
takesArrayRef({elem1, elem2});
Part of rdar://problem/16375365. Reviewed by Duncan Exon Smith.
llvm-svn: 283798
Add integer expansion for FLT_ROUNDS_ for targets where i32 is not a legal
type.
Patch by Edward Jones, thanks!
Differential Revision: https://reviews.llvm.org/D24459
llvm-svn: 283797
This is equivalent to the C++14 std::apply(). Since we are not
using C++14 yet, this allows us to still make use of apply anyway.
Differential revision: https://reviews.llvm.org/D25100
llvm-svn: 283779
Summary: The keys must still be copyable, because we store two copies of them.
Reviewers: timshen
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D25404
llvm-svn: 283764
The instructions VLDM/VSTM can only access word-aligned memory
locations and produce alignment fault if the condition is not met.
The compiler currently generates VLDM/VSTM for v2f64 load/store
regardless the alignment of the memory access. Instead, if a v2f64
load/store is not word-aligned, the compiler should generate
VLD1/VST1. For each non double-word-aligned VLD1/VST1, a VREV
instruction should be generated when targeting Big Endian.
Differential Revision: https://reviews.llvm.org/D25281
llvm-svn: 283763
Summary:
Rotate by 1 is translated to 1 micro-op, while rotate with imm8 is translated to 2 micro-ops.
Fixes pr30644.
Reviewers: delena, igorb, craig.topper, spatel, RKSimon
Differential Revision: https://reviews.llvm.org/D25399
llvm-svn: 283758
Fixed copy+paste vector alignment to correct for per-element scalar loads
Increased to 512-bit data sizes in preparation of avx512 tests
llvm-svn: 283748
sections_begin() may return unalignment pointer when Header->e_shoff isinvalid.
That may result in a crash in clients, for example we have one in LLD:
assert((PtrWord & ~PointerBitMask) == 0 &&
"Pointer is not sufficiently aligned");
fails when trying to push_back Elf_Shdr* (unaligned) into TinyPtrVector.
Patch forces check for alignment of Header->e_shoff.
Differential revision: https://reviews.llvm.org/D25368
llvm-svn: 283740
Commit in the name of:Coby Tayree
1.'v' constraint for (x86) non-avx arch imitates the already implemented 'x' constraint, i.e. allows XMM{0-15} & YMM{0-15} depending on the apparent arch & mode (32/64).
2.for the avx512 arch it allows [X,Y,Z]MM{0-31} (mode dependent)
This patch applies the needed changes to clang
clang patch: https://reviews.llvm.org/D25004
Differential Revision: D25005
llvm-svn: 283717
Summary:
Using Python linter flake8 on the utils/lit reveals several linter
warnings designated "F401: Unused import". Fix or silence these
warnings.
Some of these unused imports are legitimate, while some are part of lit's API.
For example, users of lit expect to be able to access `lit.formats.ShTest` in
their `lit.cfg`, despite the module hierarchy for that symbol actually being
`lit.formats.shtest.ShTest`. To silence linter errors for these lines,
include a "noqa" directive.
Reviewers: echristo, delcypher, beanz, ddunbar
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D25407
llvm-svn: 283710
Summary:
`TestingProgressDisplay` initializes its `current` attribute to `None`, but
never reads or writes the value again. Remove it.
Reviewers: echristo, delcypher, beanz, ddunbar
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D25415
llvm-svn: 283709
Summary:
`ArgumentError` is not defined by the Python standard library.
Executing this line of code would throw a exception, but not the
intended one. It would throw a `NameError` exception, since `ArgumentError`
is undefined.
Use `ValueError` instead, which is defined by the Python standard
library.
Reviewers: echristo, delcypher, beanz, ddunbar
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D25410
llvm-svn: 283708
Summary:
Semicolons aren't necessary as statement terminators in Python, and
each of these uses are superfluous as they appear at the end of a line.
The convention is to not use semicolons where not needed, so remove them.
Reviewers: echristo, delcypher, beanz, ddunbar
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D25409
llvm-svn: 283707
Summary: `prefix` is written to but never read.
Reviewers: echristo, delcypher, beanz, ddunbar
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D25408
llvm-svn: 283706
Summary:
The minimum version of Python required to run LLVM's test suite is 2.7.
Remove a workaround for older Python versions.
Reviewers: echristo, delcypher, beanz, ddunbar
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D25400
llvm-svn: 283705
Masked-expand-load node represents load operation that loads a variable amount of elements from memory according to amount of "true" bits in the mask and expands the loaded elements according to their position in the mask vector.
Right now, the node is used in intrinsics for VEXPAND* instructions.
The work is done towards implementation of masked.expandload and masked.compressstore intrinsics.
Differential Revision: https://reviews.llvm.org/D25322
llvm-svn: 283694
The core of the change is supposed to be NFC, however it also fixes
what I believe was an undefined behavior when calling:
va_start(ValueArgs, Desc);
with Desc being a StringRef.
Differential Revision: https://reviews.llvm.org/D25342
llvm-svn: 283671
This seems to have been responsible for the XMM16-31 spills observed in PR29112. With this fixed the test case has been modified to no longer have a spill of XMM16.
llvm-svn: 283668
Summary:
When there is a call to an alias in the same module, we were not
adding a call edge. So we could incorrectly think that the alias
was dead if it was inlined in that function, despite having a
reference imported elsewhere. This resulted in unsats at link time.
Add a call edge when the call is to an alias.
Reviewers: davide, mehdi_amini
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D25384
llvm-svn: 283664
Avoid generating indexed vector instructions for Exynos. This is needed for
fmla/fmls/fmul/fmulx. For example, the instruction
fmla v0.4s, v1.4s, v2.s[1]
is less efficient than the instructions
dup v2.4s, v2.s[1]
fmla v0.4s, v1.4s, v2.4s
Patch written by Abderrazek Zaafrani.
Differential Revision: https://reviews.llvm.org/D21571
llvm-svn: 283663
Value names may be prefixed with a binary '1' to indicate that the
backend should not modify the symbols due to any platform naming
convention.
This should not show up in the YAML opt record file because it breaks
the YAML parser.
llvm-svn: 283656
Clang always emit a hash for ThinLTO, but as other frontend are
starting to use ThinLTO, this could be a serious bug.
Differential Revision: https://reviews.llvm.org/D25379
llvm-svn: 283655
We need to add an entry in the combined-index for modules that have
a hash but otherwise empty summary, this is needed so that we can
get the hash for the module.
Also, if no entry is present in the combined index for a module, we
need to skip it when trying to compute a cache entry.
Differential Revision: https://reviews.llvm.org/D25300
llvm-svn: 283654
This is the first step towards round-tripping symbol information,
and thusly being able to write symbol information to a PDB.
This patch writes the symbol information for each compiland to
the Yaml when running in pdb2yaml mode. There's still some loose
ends, such as what to do about relocations (necessary in order to
print linkage names), how to print enums with friendly names, and
how to give the dumper access to the StringTable, but this is a
good first start.
llvm-svn: 283641
Once MULHS was expanded, this exposed an issue where the condition
register was thought to be 16-bit. This caused an attempt to copy a
16-bit register to an 8-bit register.
Authored by Jake Goulding
llvm-svn: 283634
Because screen space is precious, if an optimization (vectorization, for
example) never happens, don't leave empty space for the associated markers on
every line of the output. This makes the output much more compact, and allows
for the later inclusion of markers for more (although perhaps rare)
optimizations.
llvm-svn: 283626
Summary:
If heap allocation of a coroutine is elided, we need to make sure that we will update an address stored in the coroutine frame from f.destroy to f.cleanup.
Before this change, CoroSplit synthesized these stores after coro.begin:
```
store void (%f.Frame*)* @f.resume, void (%f.Frame*)** %resume.addr
store void (%f.Frame*)* @f.destroy, void (%f.Frame*)** %destroy.addr
```
In those cases where we did heap elision, but were not able to devirtualize all indirect calls, destroy call will attempt to "free" the coroutine frame stored on the stack. Oops.
Now we use select to put an appropriate coroutine subfunction in the destroy slot. As bellow:
```
store void (%f.Frame*)* @f.resume, void (%f.Frame*)** %resume.addr
%0 = select i1 %need.alloc, void (%f.Frame*)* @f.destroy, void (%f.Frame*)* @f.cleanup
store void (%f.Frame*)* %0, void (%f.Frame*)** %destroy.addr
```
Reviewers: majnemer
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D25377
llvm-svn: 283625
The tail duplication pass uses an assumed layout when making duplication
decisions. This is fine, but passes up duplication opportunities that
may arise when blocks are outlined. Because we want the updated CFG to
affect subsequent placement decisions, this change must occur during
placement.
In order to achieve this goal, TailDuplicationPass is split into a
utility class, TailDuplicator, and the pass itself. The pass delegates
nearly everything to the TailDuplicator object, except for looping over
the blocks in a function. This allows the same code to be used for tail
duplication in both places.
This change, in concert with outlining optional branches, allows
triangle shaped code to perform much better, esepecially when the
taken/untaken branches are correlated, as it creates a second spine when
the tests are small enough.
Issue from previous rollback fixed, and a new test was added for that
case as well. Issue was worklist/scheduling/taildup issue in layout.
Issue from 2nd rollback fixed, with 2 additional tests. Issue was
tail merging/loop info/tail-duplication causing issue with loops that share
a header block.
Differential revision: https://reviews.llvm.org/D18226
llvm-svn: 283619
The code used llvm basic block predecessors to decided where to insert phi
nodes. Instruction selection can and will liberally insert new machine basic
block predecessors. There is not a guaranteed one-to-one mapping from pred.
llvm basic blocks and machine basic blocks.
Therefore the current approach does not work as it assumes we can mark
predecessor machine basic block as needing a copy, and needs to know the set of
all predecessor machine basic blocks to decide when to insert phis.
Instead of computing the swifterror vregs as we select instructions, propagate
them at the end of instruction selection when the MBB CFG is complete.
When an instruction needs a swifterror vreg and we don't know the value yet,
generate a new vreg and remember this "upward exposed" use, and reconcile this
at the end of instruction selection.
This will only happen if the target supports promoting swifterror parameters to
registers and the swifterror attribute is used.
rdar://28300923
llvm-svn: 283617
Type visitor code had already been refactored previously to
decouple the visitor and the visitor callback interface. This
was necessary for having the flexibility to visit in different
ways (for example, dumping to yaml, reading from yaml, dumping
to ScopedPrinter, etc).
This patch merely implements the same visitation pattern for
symbol records that has already been implemented for type records.
llvm-svn: 283609
Summary: Add tests for cases where we have zero coverage in RS4GC.
Reviewers: sanjoy, reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D25341
llvm-svn: 283591
If we're going to canonicalize IR towards select of constants, try harder to create those.
Also, don't lose the metadata.
This is actually 4 related transforms in one patch:
// select X, (sext X), C --> select X, -1, C
// select X, (zext X), C --> select X, 1, C
// select X, C, (sext X) --> select X, C, 0
// select X, C, (zext X) --> select X, C, 0
Differential Revision: https://reviews.llvm.org/D25126
llvm-svn: 283575
This is a new tool built on top of the new YAML ouput generated from
optimization remarks. It produces HTML for easy navigation and
visualization.
The tool assumes that hotness information for the remarks is available
(the YAML file was produced with PGO). It uses hotness to list the
remarks prioritized by the hotness on the index page. Clicking the
source location of the remark in the list takes you the source where the
remarks are rendedered inline in the source.
For now, the tool is meant as prototype.
It's written in Python. It uses PyYAML to parse the input.
Differential Revision: https://reviews.llvm.org/D25348
llvm-svn: 283571
Summary: -fsample-profile needs discriminator, which will not be added if built with -g0. This patch makes sure the discriminator is added for sample-profile at -g0. A followup patch will be send out to update clang tests.
Reviewers: davidxl, dblaikie, echristo, dnovillo
Subscribers: mehdi_amini, probinson, llvm-commits
Differential Revision: https://reviews.llvm.org/D25132
llvm-svn: 283565
Previously, we marked the branch conditions of latch blocks uniform after
vectorization if they were instructions contained in the loop. However, if a
condition instruction has users other than the branch, it may not remain
uniform. This patch ensures the conditions we mark uniform are only used by the
branch. This should fix PR30627.
Reference: https://llvm.org/bugs/show_bug.cgi?id=30627
llvm-svn: 283563
Summary:
While walking defs of pointer operands we were assuming that the pointer
size would remain constant. This is not true, because addresspacecast
instructions may cast the pointer to an address space with a different
pointer width.
This partial reverts r282612, which was a more conservative solution
to this problem.
Reviewers: reames, sanjoy, apilipenko
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D24772
llvm-svn: 283557
Reapplying r283383 after revert in r283442. The additional fix
is a getting rid of a stray space in a function name, in the
refactoring part of the commit.
This avoids falling back to calling out to the GCC rem functions
(__moddi3, __umoddi3) when targeting Windows.
The __rt_div functions have flipped the two arguments compared
to the __aeabi_divmod functions. To match MSVC, we emit a
check for division by zero before actually calling the library
function (even if the library function itself also might do
the same check).
Not all calls to __rt_div functions for division are currently
merged with calls to the same function with the same parameters
for the remainder. This is more wasteful than a div + mls as before,
but avoids calls to __moddi3.
Differential Revision: https://reviews.llvm.org/D25332
llvm-svn: 283550
MOVSD/MOVSS take a 128-bit register and a FR32/FR64 register input, the commutation code wasn't taking this into account leading to verification errors.
This patch inserts a vreg copy mi to ensure that the registers are correct.
Fix for PR30607
Differential Revision: https://reviews.llvm.org/D25280
llvm-svn: 283539
unrolling.
The next code is not vectorized by the SLPVectorizer:
```
int test(unsigned int *p) {
int sum = 0;
for (int i = 0; i < 8; i++)
sum += p[i];
return sum;
}
```
During optimization this loop is fully unrolled and SLPVectorizer is
unable to vectorize it. Patch tries to fix this problem.
Differential Revision: https://reviews.llvm.org/D24796
llvm-svn: 283535
With the ROPI and RWPI relocation models we can't always have pointers
to global data or functions in constant data, so don't try to convert switches
into lookup tables if any value in the lookup table would require a relocation.
We can still safely emit lookup tables of other values, such as simple
constants.
Differential Revision: https://reviews.llvm.org/D24462
llvm-svn: 283530
Summary:
There was a bug with sequences like
s_mov_b64 s[0:1], exec
s_and_b64 s[2:3]<def>, s[0:1], s[2:3]<kill>
...
s_mov_b64_term exec, s[2:3]
because s[2:3] was defined and used in the same instruction, ending up with
SaveExecInst inside OtherUseInsts.
Note that the test case also exposes an unrelated bug.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98028
Reviewers: tstellarAMD, arsenm
Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye
Differential Revision: https://reviews.llvm.org/D25306
llvm-svn: 283528
Summary:
This class deals with the lowering of CodeGen `MachineInstr` objects to
MC `MCInst` objects.
Reviewers: kparzysz, arsenm
Subscribers: wdng, beanz, japaric, mgorny
Differential Revision: https://reviews.llvm.org/D25269
llvm-svn: 283522
In the left part of the reports, we have things like U<number>; if some of
these numbers use more digits than others, we don't want a space in between the
U and the start of the number. Instead, the space should come afterward. This
way it is clear that the number goes with the U and not any other optimization
indicator that might come later on the line.
Tests committed in r283518.
llvm-svn: 283519
In the left part of the reports, we have things like U<number>; if some of
these numbers use more digits than others, we don't want a space in between the
U and the start of the number. Instead, the space should come afterward. This
way it is clear that the number goes with the U and not any other optimization
indicator that might come later on the line.
llvm-svn: 283518
GetCaseResults assumed that a terminator with one successor was an
unconditional branch. This is not necessarily the case, it could be a
cleanupret.
Strengthen the check by querying whether or not the terminator is
exceptional.
llvm-svn: 283517
Vectorizer tests in the target-independent directory should not have a target
triple. If a test really needs to query a specific backend, it belongs in the
right target subdirectory (which "REQUIRES" the right backend). Otherwise, it
should not specify a triple.
llvm-svn: 283512
Summary:
I had for the second time today a bug where llvm::format("%s", Str)
was called with Str being a StringRef. The Linux and MacOS bots were
fine, but windows having different calling convention, it printed
garbage.
Instead we can catch this at compile-time: it is never expected to
call a C vararg printf-like function with non scalar type I believe.
Reviewers: bogner, Bigcheese, dexonsmith
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D25266
llvm-svn: 283509
Per spec changes, this implements block signatures, and adds just enough
logic to produce correct block signatures at the ends of functions.
Differential Revision: https://reviews.llvm.org/D25144
llvm-svn: 283503
Per spec changes, store instructions in WebAssembly no longer have a return
value. Update the instruction descriptions.
Differential Revision: https://reviews.llvm.org/D25122
llvm-svn: 283501
Summary:
These nodes need legalization for 3-element vectors. This commit
handles the legalization and adds tests for zext and sext.
This fixes PR30614.
Reviewers: RKSimon, srhines
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D25268
llvm-svn: 283496
Add a weak alias to the renamed Comdat function in IR level instrumentation,
using it's original name. This ensures the same behavior w/ and w/o IR
instrumentation, even for non standard conforming code.
Differential Revision: http://reviews.llvm.org/D25339
llvm-svn: 283490
When replacing FrameIndex with BasePtr, we must preserve BasePtr for
LEA64_32r since BasePtr is used later for stack adjustment if it is
the same as StackPtr.
Patch by H.J Lu <hjl.tools@gmail.com>
Differential Revision: https://reviews.llvm.org/D23575
llvm-svn: 283486
This generalizes the build_vector -> vector_shuffle combine to support any
number of inputs. The idea is to create a binary tree of shuffles, where
the first layer performs pairwise shuffles of the input vectors placing each
input element into the correct lane, and the rest of the tree blends these
shuffles together.
This doesn't try to be smart and create any sort of "optimal" shuffles.
The assumption is that even a "poor" shuffle sequence is better than extracting
and inserting the elements one by one.
Differential Revision: https://reviews.llvm.org/D24683
llvm-svn: 283480
This adds a new function to DebugInfo.cpp that takes an llvm::Module
as input and removes all debug info metadata that is not directly
needed for line tables, thus effectively stripping all type and
variable information from the module.
The primary motivation for this feature was the bitcode work flow
(cf. http://lists.llvm.org/pipermail/llvm-dev/2016-June/100643.html
for more background). This is not wired up yet, but will be in
subsequent patches. For testing, the new functionality is exposed to
opt with a -strip-nonlinetable-debuginfo option.
The secondary use-case (and one that works right now!) is as a
reduction pass in bugpoint. I added two new bugpoint options
(-disable-strip-debuginfo and -disable-strip-debug-types) to control
the new features. By default it will first attempt to remove all debug
information, then only the type info, and then proceed to hack at any
remaining MDNodes.
llvm-svn: 283473
When constant folding an operation to a copy or an immediate
mov, the implicit uses/defs of the old instruction were left behind,
e.g. replacing v_or_b32 left the implicit exec use on the new copy.
llvm-svn: 283471
Summary:
The acronym PR could be ambiguous to some users, especially those who
are used to interpreting it as GitHub's "pull request".
Reviewers: ddunbar, jordan_rose, void, beanz
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D25331
llvm-svn: 283465
Change erroneous parsing of push immediate instructions in intel syntax
to default to pointer size by rewriting into the ATT style for matching.
This fixes PR22028.
Reviewers: majnemer, rnk
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D25288
llvm-svn: 283457
When there are multiple optimizations on one line, record the vectorization
factors, etc. correctly (instead of incorrectly substituting default values).
llvm-svn: 283443
This reverts commit r283383 because it broke some of the bots:
undefined reference to ` __aeabi_uldivmod'
It affected (at least) clang-cmake-armv7-a15-selfhost,
clang-cmake-armv7-a15-selfhost and clang-native-arm-lnt.
llvm-svn: 283442
These ones need to have the size on the pseudo instruction set for
getInstSizeInBytes to work correctly. These also have a statically
known size.
llvm-svn: 283437
Summary:
The computeKnownBits and ComputeNumSignBits functions in ValueTracking can now do a simple look-through of ExtractElement.
Reviewers: majnemer, spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D24955
llvm-svn: 283434
Global variables are GlobalValues, so they have explicit alignment. Querying
DataLayout for the alignment was incorrect.
Testcase added.
llvm-svn: 283423
We can work around a shortcoming of FileCheck by using {{\[}} to match a square
bracket before a [[ sequence.
Thanks to Eli Friedman for the heads up!
llvm-svn: 283422
If we don't truncate, LLVM asserts when the label difference doesn't fit
in a 16 bit field. This patch truncates two kinds of data: trailing null
terminated names in symbol records, and inline line tables. The inline
line table test that I have is too large (many MB), so I'm not checking
it in.
Hopefully fixes PR28264.
llvm-svn: 283403