This processor feature had been left out by mistake from the z13
ProcessorModel.
This time with updated test case. Thanks, Hans.
Reviewed by Ulrich Weigand.
llvm-svn: 274216
CodeView need to know the offset of the storage allocation for a
bitfield. Encode this via the "extraData" field in DIDerivedType and
introduced a new flag, DIFlagBitField, to indicate whether or not a
member is a bitfield.
This fixes PR28162.
Differential Revision: http://reviews.llvm.org/D21782
llvm-svn: 274200
re-insertion of entries into the worklist moves them to the end.
This is fairly similar to a SetVector, but helps in the case where in
addition to not inserting duplicates you want to adjust the sequence of
a pop-off-the-back worklist.
I'm not at all attached to the name of this data structure if others
have better suggestions, but this is one that David Majnemer brought up
in IRC discussions that seems plausible.
I've trimmed the interface down somewhat from SetVector's interface
because several things make less sense here IMO: iteration primarily.
I'd prefer to add these back as we have users that need them. My use
case doesn't even need all of what is provided here. =]
I've also included a basic unittest to make sure this functions
reasonably.
Differential Revision: http://reviews.llvm.org/D21866
llvm-svn: 274198
This patch makes CFLAA answer some ModRef queries. Because we don't
distinguish between reading/writing when making StratifiedSets, we're
unable to offer any of the readonly-related answers.
Patch by Jia Chen.
Differential Revision: http://reviews.llvm.org/D21858
llvm-svn: 274197
On Darwin it is currently impossible to build LLVM with modules
because the Darwin system module map is not compatible with
-fmodules-local-submodule-visibility at this point in time. This
patch makes the flag optional and off by default on Darwin so it
becomes possible to build LLVM with modules again.
http://reviews.llvm.org/D21827
rdar://problem/27019000
llvm-svn: 274196
- Use range based for loops
- No need for some !Reg checks: isPhysicalRegister() reports false for
NoRegister anyway
- Do not repeat function name in documentation comment.
- Do not repeat documentation comment in implementation when we already
have one at the declaration.
- Factor some common subexpressions out.
- Change file comments to use doxygen syntax.
llvm-svn: 274194
Add an explicit overload to BuildMI for MachineInstr& to deal with
insertions inside of instruction bundles.
- Use it to re-implement MachineInstr* to give it coverage.
- Document how the overload for MachineBasicBlock::instr_iterator
differs from that for MachineBasicBlock::iterator (the previous
(implicit) overload for MachineInstr&).
- Add a comment explaining why the MachineInstr& and MachineInstr*
overloads don't universally forward to the
MachineBasicBlock::instr_iterator overload.
Thanks to Justin for noticing the API quirk. While this doesn't fix any
known bugs -- all uses of BuildMI with a MachineInstr& were previously
using MachineBasicBlock::iterator -- it protects against future bugs.
llvm-svn: 274193
This is mostly a mechanical change to make TargetInstrInfo API take
MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator)
when the argument is expected to be a valid MachineInstr. This is a
general API improvement.
Although it would be possible to do this one function at a time, that
would demand a quadratic amount of churn since many of these functions
call each other. Instead I've done everything as a block and just
updated what was necessary.
This is mostly mechanical fixes: adding and removing `*` and `&`
operators. The only non-mechanical change is to split
ARMBaseInstrInfo::getOperandLatencyImpl out from
ARMBaseInstrInfo::getOperandLatency. Previously, the latter took a
`MachineInstr*` which it updated to the instruction bundle leader; now,
the latter calls the former either with the same `MachineInstr&` or the
bundle leader.
As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.
Note: I updated WebAssembly, Lanai, and AVR (despite being
off-by-default) since it turned out to be easy. I couldn't run tests
for AVR since llc doesn't link with it turned on.
llvm-svn: 274189
- Use range based for
- Use the more common variable names MBB and MF for
MachineBasicBlock/MachineFunction variables.
- Add a few const modifiers
llvm-svn: 274187
The NewArchiveIterator class has a problem: it requires too much context. Any
memory buffers added to the archive must be stored within an Archive::Member,
which must have an associated Archive. This makes it harder than necessary
to create new archive members (or new archives entirely) from scratch using
memory buffers.
This patch replaces NewArchiveIterator with a NewArchiveMember class that
stores just the memory buffer and the information that goes into the archive
member header.
Differential Revision: http://reviews.llvm.org/D21721
llvm-svn: 274183
This fixes an issue where occurrence counts would be unexpectedly
reset when parsing different parts of a command line multiple
times.
**ORIGINAL COMMIT MESSAGE**
This allows command line tools to use syntaxes like the following:
llvm-foo.exe command1 -o1 -o2
llvm-foo.exe command2 -p1 -p2
Where command1 and command2 contain completely different sets of
valid options. This is backwards compatible with previous uses
of llvm cl which did not support subcommands, as any option
which specifies no optional subcommand (e.g. all existing
code) goes into a special "top level" subcommand that expects
dashed options to appear immediately after the program name.
For example, code which is subcommand unaware would generate
a command line such as the following, where no subcommand
is specified:
llvm-foo.exe -q1 -q2
The top level subcommand can co-exist with actual subcommands,
as it is implemented as an actual subcommand which is searched
if no explicit subcommand is specified. So llvm-foo.exe as
specified above could be written so as to support all three
aforementioned command lines simultaneously.
There is one additional "special" subcommand called AllSubCommands,
which can be used to inject an option into every subcommand.
This is useful to support things like help, so that commands
such as:
llvm-foo.exe --help
llvm-foo.exe command1 --help
llvm-foo.exe command2 --help
All work and display the help for the selected subcommand
without having to explicitly go and write code to handle each
one separately.
This patch is submitted without an example of anything actually
using subcommands, but a followup patch will convert the
llvm-pdbdump tool to use subcommands.
Reviewed By: beanz
llvm-svn: 274171
This is a fix for PR27842.
An IR-level implementation of stack coloring tailored to work with
SafeStack. It is a bit weaker than the MI implementation in that it
does not the "lifetime start at first access" logic. This can be
improved in the future.
This patch also replaces the naive implementation of stack frame
layout with a greedy algorithm that can split existing stack slots
and even fit small objects inside the alignment padding of other
objects.
llvm-svn: 274162
[x86] (PR15455) While (ins|outs)[bwld] instructions do not take %dx as a
memory operand, various unofficial references do and objdump
disassembles to this format. Extend special treatment of
similar (in|out)[bwld] operations.
Reviewers: craig.topper, rnk, ab
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18837
llvm-svn: 274152
This fixes pr28072.
The point, as Duncan pointed out, is that the file is already
partially linked by just reading it.
Long term I think the solution is to make metadata owned by the module
and then the linker will lazily read it and be in charge of all the
linking. Running a verifier in each input will defeat the lazy
loading, but will be legal.
Right now we are at the unfortunate position that to support odr
merging we cannot verify the inputs, which mildly annoying (see test
update).
llvm-svn: 274148
Summary: This adds tests for covering all cases that FoldAndOfFCmps and FoldOrOfFCmps handle.
Reviewers: spatel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D21844
llvm-svn: 274144
I'm planning on extending these two tests with checks that validate
html coverage reports. Make it easier to extend them by not using a
prefix called "CHECK".
llvm-svn: 274143
When lowering two blended PACKUS, we used to disregard the types
of the PACKUS inputs, indiscriminately generating a v16i8 PACKUS.
This leads to non-selectable things like:
(v16i8 (PACKUS (v4i32 v0), (v4i32 v1)))
Instead, check that the PACKUSes have the same type, and use that
as the final result type.
llvm-svn: 274138
In -output-dir mode, file reports are placed into a "coverage"
directory. If filenames in the coverage mapping contain "..", they might
escape out of this directory.
Fix the problem by removing ".." from source filenames (expand the path
component).
llvm-svn: 274135
This gets rid of the memory fence in the hot path (dereferencing the
ManagedStatic), trading for an extra mutex lock in the cold path (when
the ManagedStatic was uninitialized). Since this only happens on the
first accesses it shouldn't matter much. On strict architectures like
x86 this removes any atomic instructions from the hot path.
Also remove the tsan annotations, tsan knows how standard atomics work
so they should be unnecessary now.
llvm-svn: 274131
This reverts commit 520a8298d8ef676b5da617ba3d2c7fa37381e939 (r273055).
This is breaking stage2 instrumented builds with "malformed coverage
data" errors.
llvm-svn: 274106
This is breaking an optimizaton remark test in clang. I've identified a couple fixes for that, but want to understand it better before I commit to anything.
llvm-svn: 274102
For the new hotness attribute, the API will take the pass rather than
the pass name so we can no longer play the trick of AlwaysPrint being a
special pass name. This adds a getter to help the transition.
There is also a corresponding clang patch.
llvm-svn: 274100
If a operation for a recurrence is an addition with no signed wrap and both input sign bits are 0, then the result sign bit must also be 0. Similar for the negative case.
I found this deficiency while playing around with a loop in the x86 backend that contained a signed division that could be optimized into an unsigned division if we could prove both inputs were positive. One of them being the loop induction variable. With this patch we can perform the conversion for this case. One of the test cases here is a contrived variation of the loop I was looking at.
Differential revision: http://reviews.llvm.org/D21493
llvm-svn: 274098
Revert "[InstCombine] Combine A->B->A BitCast"
as this appears to cause PR27996 and as discussed in http://reviews.llvm.org/D20847
This reverts commits r270135 and r263734.
llvm-svn: 274094
This allows us to query about the endianness without having to
look at DataLayout. The API will be used (and tested) in lld,
in order to find out the endianness of BitcodeFiles.
Briefly discussed with Rafael.
llvm-svn: 274090
- Add renderView{Header,Footer}, renderLineSuffix, and hasSubViews to
support creating tables with nested views.
- Move the 'Format' cl::opt to make it easier to extend.
- Just create one function view file, instead of overwriting the same
file for every new function. Add a regression test for this.
llvm-svn: 274086
and its clients to use the new llvm::Error model for error handling.
Changed getAsArchive() from ErrorOr<...> to Expected<...> so now all
interfaces there use the new llvm::Error model for return values.
In the two places it had if (!Parent) this is actually a program error so changed
from returning errorCodeToError(object_error::parse_failed) to calling
report_fatal_error() with a message.
In getObjectForArch() added error messages to its two llvm::Error return values
instead of returning errorCodeToError(object_error::arch_not_found) with no
error message.
For the llvm-obdump, llvm-nm and llvm-size clients since the only binary files in
Mach-O Universal Binaries that are supported are Mach-O files or archives with
Mach-O objects, updated their logic to generate an error when a slice contains
something like an ELF binary instead of ignoring it. And added a test case for
that.
The last error stuff to be cleaned up for libObject’s MachOUniversalBinary is
the use of errorOrToExpected(Archive::create(ObjBuffer)) which needs
Archive::create() to be changed from ErrorOr<...> to Expected<...> first,
which I’ll work on next.
llvm-svn: 274079
Summary:
This fixes bug: https://llvm.org/bugs/show_bug.cgi?id=28282
Currently the cost model of constant hoisting checks the bit width of the data type of the constants.
However, the actual immediate value is small enough and not need to be hoisted.
This patch checks for the actual bit width needed for the constant.
Reviewers: t.p.northover, rengolin
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: http://reviews.llvm.org/D21668
llvm-svn: 274073
Summary: LLVM assumes that large clearance will hide the partial register spill penalty. But in our experiment, 16 clearance is too small. As the inserted XOR is normally fairly cheap, we should have a higher clearance threshold to aggressively insert XORs that is necessary to break partial register dependency.
Reviewers: wmi, davidxl, stoklund, zansari, myatsina, RKSimon, DavidKreitzer, mkuper, joerg, spatel
Subscribers: davidxl, llvm-commits
Differential Revision: http://reviews.llvm.org/D21560
llvm-svn: 274068
Our existing yaml::Output code writes tags immediately when mapTag is called, without any state handling. This results in tags on sequence elements being written before the element itself. For example, we see this:
SomeArray: !elem_type
- key1: 1
key2: 2 !elem_type2
- key3: 3
key4: 4
We should instead see:
SomeArray:
- !elem_type
key1: 1
key2: 2
- !elem_type2
key3: 3
key4: 4
Our reader handles reading properly, so this bug only impacts writing yaml sequences with tagged elements.
As a test for this I've modified the Mach-O yaml encoding to allways apply the !mach-o tag when encoding MachOYAML::Object entries. This results in the !mach-o tag appearing as expected in dumped fat files.
llvm-svn: 274067
Summary: SystemZ shift instructions only use the last 6 bits of the shift
amount. When the result of an AND operation is used as a shift amount, this
means that we can use the NILL instruction (which operates on the last 16 bits)
rather than NILF (which operates on the last 32 bits) for a 16-bit savings in
instruction size.
Reviewers: uweigand
Subscribers: llvm-commits
Author: colpell
Committing on behalf of Elliot.
Differential Revision: http://reviews.llvm.org/D21686
llvm-svn: 274066
I think this converts all the simple cases that really just care about
the generated code being position independent or not. The remaining
uses are a bit more complicated and are checking things like "is this
a library or executable" or "can this symbol be preempted".
llvm-svn: 274055
This allows command line tools to use syntaxes like the following:
llvm-foo.exe command1 -o1 -o2
llvm-foo.exe command2 -p1 -p2
Where command1 and command2 contain completely different sets of
valid options. This is backwards compatible with previous uses
of llvm cl which did not support subcommands, as any option
which specifies no optional subcommand (e.g. all existing
code) goes into a special "top level" subcommand that expects
dashed options to appear immediately after the program name.
For example, code which is subcommand unaware would generate
a command line such as the following, where no subcommand
is specified:
llvm-foo.exe -q1 -q2
The top level subcommand can co-exist with actual subcommands,
as it is implemented as an actual subcommand which is searched
if no explicit subcommand is specified. So llvm-foo.exe as
specified above could be written so as to support all three
aforementioned command lines simultaneously.
There is one additional "special" subcommand called AllSubCommands,
which can be used to inject an option into every subcommand.
This is useful to support things like help, so that commands
such as:
llvm-foo.exe --help
llvm-foo.exe command1 --help
llvm-foo.exe command2 --help
All work and display the help for the selected subcommand
without having to explicitly go and write code to handle each
one separately.
This patch is submitted without an example of anything actually
using subcommands, but a followup patch will convert the
llvm-pdbdump tool to use subcommands.
Reviewed By: beanz
Differential Revision: http://reviews.llvm.org/D21485
llvm-svn: 274054
This is a resubmittion of 263158 change after fixing the existing problem with intrinsics mangling (see LTO and intrinsics mangling llvm-dev thread for details).
This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace.
The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics.
Reviewed By: reames
Differential Revision: http://reviews.llvm.org/D17270
llvm-svn: 274043
This index lists the reports available in the 'coverage' sub-directory.
This will help navigate coverage output from large projects.
This commit factors the file creation code out of SourceCoverageView and
into CoveragePrinter.
llvm-svn: 274029
The original implementation attempted to zero registers using
XOR %foo, %foo. This is problematic because it constitutes a
read-modify-write of a register which might not be defined.
Instead, use MOV32r0 to avoid these problems; expandPostRAPseudo does
the right thing here.
llvm-svn: 274024
AVX1 can only broadcast vectors as floats/doubles, so for 256-bit vectors we insert bitcasts if we are shuffling v8i32/v4i64 types. Unfortunately the presence of these bitcasts prevents the current broadcast lowering code from peeking through cases where we have concatenated / extracted vectors to create the 256-bit vectors.
This patch allows us to peek through bitcasts as long as the number of elements doesn't change (i.e. element bitwidth is the same) so the broadcast index is not affected.
Note this bitcast peek is different from the stage later on which doesn't care about the type and is just trying to find a load node.
As we're being more aggressive with bitcasts, we also need to ensure that the broadcast type is correctly bitcasted
Differential Revision: http://reviews.llvm.org/D21660
llvm-svn: 274013
Some headers in IR depend on tablegen generated code. Modules builds triggered
generation of the LLVM_IR module (including headers dependant on intrinsic_gen),
imposing a unnecessary build dependency.
Reviewed by Richard Smith.
llvm-svn: 274006
This patch allows target shuffles to be combined to single input immediate permute instructions - (V)PSHUFD/VPERMILPD/VPERMILPS - allowing more general pattern matching than what we current do and improves the likelihood of memory folding compared to existing patterns which tend to reuse the input in multiple arguments.
Further permute instructions (V)PSHUFLW/(V)PSHUFHW/(V)PERMQ/(V)PERMPD may be added in the future but its proven tricky to create tests cases for them so far. (V)PSHUFLW/(V)PSHUFHW is already handled quite well in combineTargetShuffle so it may be that removing some of that code may allow us to perform more of the combining in one place without duplication.
Differential Revision: http://reviews.llvm.org/D21148
llvm-svn: 273999
This patch enhances dot graph viewer to show hot regions
with hot bbs/edges displayed in red. The ratio of the bb
freq to the max freq of the function needs to be no less
than the value specified by view-hot-freq-percent option.
The default value is 10 (i.e. 10%).
llvm-svn: 273996
MBFI supports profile count dumping and function
name based filtering. Add these two feature to
BFI as well. The filtering option is shared between
BFI and MBFI: -view-bfi-func-name=..
llvm-svn: 273992
If the load is conditional we can't hoist its 0-iteration instance to
the preheader because that would make it unconditional. Thus we would
access a memory location that the original loop did not access.
llvm-svn: 273991
BFI and MBFI's dot traits class share most of the
code and all future enhancement. This patch extracts
common implementation into base class BFIDOTGraphTraitsBase.
This patch also enables BFI graph to show branch probability
on edges as MBFI does before.
llvm-svn: 273990
Passing -output-dir path/to/dir to llvm-cov show creates path/to/dir if
it doesn't already exist, and prints reports into that directory.
In function view mode, all views are written into
path/to/dir/functions.$EXTENSION. In file view mode, all views are
written into path/to/dir/coverage/$PATH.$EXTENSION.
Changes since the initial commit:
- Avoid accidentally closing stdout twice.
llvm-svn: 273985
This reverts commit r273971. test/profile/instrprof-visibility.cpp is
failing because of an uncaught error in SafelyCloseFileDescriptor.
llvm-svn: 273978
Passing -output-dir path/to/dir to llvm-cov show creates path/to/dir if
it doesn't already exist, and prints reports into that directory.
In function view mode, all views are written into
path/to/dir/functions.$EXTENSION. In file view mode, all views are
written into path/to/dir/coverage/$PATH.$EXTENSION.
llvm-svn: 273971
the new pass manager.
This adds operator<< overloads for the various bits of the
LazyCallGraph, dump methods for use from the debugger, and debug logging
using them to the CGSCC pass manager.
Having this was essential for debugging the call graph update patch, and
I've extracted what I could from that patch here to minimize the delta.
llvm-svn: 273961
Apparently, MSVC complains if there's an implicit conversion from
`unsigned` to `unsigned long long`, if the `unsigned` is the result of
a bit shift.
llvm-svn: 273955
allow a good error message to be produced.
I added the one test case that the object file tools could produce an error
message. The other two errors can’t be triggered if the input file is passed
through sys::fs::identify_magic(). But the malformedError("bad magic number")
does get triggered by the logic in llvm-dsymutil when dealing with a normal
Mach-O file. The other "File too small ..." error would take a logic error
currently to produce and is not tested for.
llvm-svn: 273946
binutils ar uses -plugin to specify the LTO plugin, but LLVM doesn't
need this as it doesn't use a plugin for LTO. Accepting (and ignoring)
the option allows interoperability with existing build systems and
make downstream consumers life much easier.
No objections from Rafael on this change.
llvm-svn: 273938
Summary:
Our YAML library's handling of tags isn't perfect, but it is good enough to get rid of the need for the --format argument to yaml2obj. This patch does exactly that.
Instead of requiring --format, it infers the format based on the tags found in the object file. The supported tags are:
!ELF
!COFF
!mach-o
!fat-mach-o
I have a corresponding patch that is quite large that fixes up all the in-tree test cases.
Reviewers: rafael, Bigcheese, compnerd, silvas
Subscribers: compnerd, llvm-commits
Differential Revision: http://reviews.llvm.org/D21711
llvm-svn: 273915
This is a resubmittion of 263158 change after fixing the existing problem with intrinsics mangling (see LTO and intrinsics mangling llvm-dev thread for details).
This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace.
The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics.
Reviewed By: reames
Differential Revision: http://reviews.llvm.org/D17270
llvm-svn: 273892
See the bug report at https://github.com/google/sanitizers/issues/691. When a dynamic alloca has a constant size, ASan instrumentation will treat it as a regular dynamic alloca (insert calls to poison and unpoison), but the backend will turn it into a regular stack variable. The poisoning/unpoisoning is then broken. This patch will treat such allocas as static.
Differential Revision: http://reviews.llvm.org/D21509
llvm-svn: 273888
Summary:
Created a pattern to match 64-bit mode (and (xor x, -1), y)
to a shorter sequence of instructions.
Before the change, the canonical form is translated to:
xihf %r3, 4294967295
xilf %r3, 4294967295
ngr %r2, %r3
After the change, the canonical form is translated to:
ngr %r3, %r2
xgr %r2, %r3
Reviewers: zhanjunl, uweigand
Subscribers: llvm-commits
Author: assem
Committing on behalf of Assem.
Differential Revision: http://reviews.llvm.org/D21693
llvm-svn: 273887
The main issue here is that the "thumb" flag wasn't set for some of these
sections, making MSVC's link.exe fails to correctly relocate code
against the symbols inside these sections. link.exe could fail for
instance with the "fixup is not aligned for target 'XX'" error. If
linking doesn't fail, the relocation process goes wrong in the end and
invalid code is generated by the linker.
This patch adds Thumb/ARM information so that the right flags are set
on COFF/Windows.
Patch by Adrien Guinet.
llvm-svn: 273880
It did not handle correctly cases without GEP.
The following loop wasn't vectorized:
for (int i=0; i<len; i++)
*to++ = *from++;
I use getPtrStride() to find Stride for memory access and return 0 is the Stride is not 1 or -1.
Re-commit rL273257 - revision: http://reviews.llvm.org/D20789
llvm-svn: 273864
This is a follow-up for r273544.
The end goal is to get rid of the isSwift / isCortexXY / isWhatever methods.
Since the ARM backend seems to have quite a lot of calls to these methods, I
intend to submit 5-6 subtarget features at a time, instead of one big lump.
Differential Revision: http://reviews.llvm.org/D21685
llvm-svn: 273853
Summary: Actually the list of cached files is sorted by file size, not by last accessed time. Also remove unused file access time param for a helper function.
Reviewers: joker-eph, chandlerc, davide
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D21639
llvm-svn: 273852
AVX1 can only broadcast vectors as floats/doubles, so for 256-bit vectors we insert bitcasts if we are shuffling v8i32/v4i64 types. Unfortunately the presence of these bitcasts prevents the current broadcast lowering code from peeking through cases where we have concatenated / extracted vectors to create the 256-bit vectors.
This patch allows us to peek through bitcasts as long as the number of elements doesn't change (i.e. element bitwidth is the same) so the broadcast index is not affected.
Note this bitcast peek is different from the stage later on which doesn't care about the type and is just trying to find a load node.
Differential Revision: http://reviews.llvm.org/D21660
llvm-svn: 273848
This variable is used by ASan (and other sanitizers in the future)
on s390x-linux to override a check for CVE-2016-2143 in the running
kernel (see revision 267747 on compiler-rt). Since the check simply
checks if the kernel version is in a whitelist of known-good versions,
it may miss distribution kernels, or manually-patched kernels - hence
the need for this variable. To enable running the ASan testsuite on
such kernels, this variable should be passed from the environment
down to the testcases.
Differential Revision: http://reviews.llvm.org/D19888
llvm-svn: 273825
Summary:
This is a straightforward extension of what LoopUnswitch does to
branches to guards. That is, we unswitch
```
for (;;) {
...
guard(loop_invariant_cond);
...
}
```
into
```
if (loop_invariant_cond) {
for (;;) {
...
// There is no need to emit guard(true)
...
}
} else {
for (;;) {
...
guard(false);
// SimplifyCFG will clean this up by adding an
// unreachable after the guard(false)
...
}
}
```
Reviewers: majnemer
Subscribers: mcrosier, llvm-commits, mzolotukhin
Differential Revision: http://reviews.llvm.org/D21725
llvm-svn: 273801
The last import is the penultimate entry, the last entry is nulled out.
Data beyond the null entry should not be considered to hold import
entries.
This fixes PR28302.
N.B. I am working on a reduced testcase, the one in PR28302 is too
large.
llvm-svn: 273790
If a sub-view has already been rendered, it's helpful to re-render the
expansion site before rendering the next expansion view. Make this fact
explicit in the rendering interface, instead of hiding it behind an
awkward Optional<LineRef> parameter.
llvm-svn: 273789
Don't cast GV expression to MCSymbolRefExpr. r272705 changed GV to binary
expressions by including offset even if the offset it 0
(we haven't hit this sooner since tested workloads don't include static offsets)
We don't really care about the type of expression, so set it directly.
Fixes: r272705
Consider section relative relocations. Since all const as data is in one boffer section relative is equivalent to abs32.
Fixes: r273166
Differential Revision: http://reviews.llvm.org/D21633
llvm-svn: 273785
This is just a small step in the direction of making LLVMConfig.cmake a complete
replacement for llvm-config.
For those unfamiliar, llvm-config --build-mode prints out CMAKE_BUILD_TYPE. Thus
as one can imagine, LLVM_BUILD_TYPE is @CMAKE_BUILD_TYPE@.
llvm-svn: 273782
SimplifyCFG had logic to insert calls to llvm.trap for two very
particular IR patterns: stores and invokes of undef/null.
While InstCombine canonicalizes certain undefined behavior IR patterns
to stores of undef, phase ordering means that this cannot be relied upon
in general.
There are much better tools than llvm.trap: UBSan and ASan.
N.B. I could be argued into reverting this change if a clear argument as
to why it is important that we synthesize llvm.trap for stores, I'd be
hard pressed to see why it'd be useful for invokes...
llvm-svn: 273778
Debugger prologue is emitted if -mattr=+amdgpu-debugger-emit-prologue.
Debugger prologue writes work group IDs and work item IDs to scratch memory at fixed location in the following format:
- offset 0: work group ID x
- offset 4: work group ID y
- offset 8: work group ID z
- offset 16: work item ID x
- offset 20: work item ID y
- offset 24: work item ID z
Set
- amd_kernel_code_t::debug_wavefront_private_segment_offset_sgpr to scratch wave offset reg
- amd_kernel_code_t::debug_private_segment_buffer_sgpr to scratch rsrc reg
- amd_kernel_code_t::is_debug_supported to true if all debugger features are enabled
Differential Revision: http://reviews.llvm.org/D20335
llvm-svn: 273769
This adds some tests for the smarter llvm-ar selection mode as well as some
additional tests as per Rafael's post commit review comments.
llvm-svn: 273768
This makes it easier to add renderers for new kinds of output formats.
- Define and document a pure-virtual coverage rendering interface.
- Move the text-based rendering logic into its a new file.
- Re-work the API to better reflect the presentation/formatting split.
llvm-svn: 273767
Remember the last choice for the top/bottom scheduling boundary in
bidirectional scheduling mode. The top choice should not change if we
schedule at the bottom and vice versa.
This allows us to improve compiletime: We only recalculate the best pick
for one border and re-use the cached top-pick from the other border.
Differential Revision: http://reviews.llvm.org/D19350
llvm-svn: 273766
Summary:
Offset folding only works if you are emitting relocations, and we don't
emit relocations for local address space globals.
Reviewers: arsenm, nhaustov
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: http://reviews.llvm.org/D21647
llvm-svn: 273765
There are two separate issues:
- LLVM doesn't consider infinite loops to be side effects: we happily
hoist/sink above/below loops whose bounds are unknown.
- The absence of the noreturn attribute is insufficient for us to know
if a function will definitely return. Relying on noreturn in the
middle-end for any property is an accident waiting to happen.
llvm-svn: 273762
This intrinsic safely loads a function pointer from a virtual table pointer
using type metadata. This intrinsic is used to implement control flow integrity
in conjunction with virtual call optimization. The virtual call optimization
pass will optimize away llvm.type.checked.load intrinsics associated with
devirtualized calls, thereby removing the type check in cases where it is
not needed to enforce the control flow integrity constraint.
This patch also introduces the capability to copy type metadata between
global variables, and teaches the virtual call optimization pass to do so.
Differential Revision: http://reviews.llvm.org/D21121
llvm-svn: 273756
In bidirectional scheduling this gives more stable results than just
comparing the "reason" fields of the top/bottom node because the reason
field may be higher depending on what other nodes are in the queue.
Differential Revision: http://reviews.llvm.org/D19401
llvm-svn: 273755
r273711 was reverted by r273743. The inliner needs to know about any
call sites in the inlined function. These were obscured if we replaced
a call to undef with an undef but kept the call around.
This fixes PR28298.
llvm-svn: 273753
COPY was lacking a scheduling class, define it to avoid regressions in
the upcoming change to the bidirectional MachineScheduler. Approved by
tstellar on IRC.
Differential Revision: http://reviews.llvm.org/D21540
llvm-svn: 273751
Summary: Set ProfileSummary in SampleProfilerLoader.
Reviewers: davidxl, eraman
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D21702
llvm-svn: 273745
This fixes an embarrassing bug when emitting .debug_loc entries for 64-bit+ constants,
which were previously silently truncated to 32 bits.
<rdar://problem/26843232>
llvm-svn: 273736
The bitset metadata currently used in LLVM has a few problems:
1. It has the wrong name. The name "bitset" refers to an implementation
detail of one use of the metadata (i.e. its original use case, CFI).
This makes it harder to understand, as the name makes no sense in the
context of virtual call optimization.
2. It is represented using a global named metadata node, rather than
being directly associated with a global. This makes it harder to
manipulate the metadata when rebuilding global variables, summarise it
as part of ThinLTO and drop unused metadata when associated globals are
dropped. For this reason, CFI does not currently work correctly when
both CFI and vcall opt are enabled, as vcall opt needs to rebuild vtable
globals, and fails to associate metadata with the rebuilt globals. As I
understand it, the same problem could also affect ASan, which rebuilds
globals with a red zone.
This patch solves both of those problems in the following way:
1. Rename the metadata to "type metadata". This new name reflects how
the metadata is currently being used (i.e. to represent type information
for CFI and vtable opt). The new name is reflected in the name for the
associated intrinsic (llvm.type.test) and pass (LowerTypeTests).
2. Attach metadata directly to the globals that it pertains to, rather
than using the "llvm.bitsets" global metadata node as we are doing now.
This is done using the newly introduced capability to attach
metadata to global variables (r271348 and r271358).
See also: http://lists.llvm.org/pipermail/llvm-dev/2016-June/100462.html
Differential Revision: http://reviews.llvm.org/D21053
llvm-svn: 273729
This patch moves MSSA's caching walker into MemorySSA, and moves the
actual definition of MSSA's caching walker out of MemorySSA.h. This is
done in preparation for the new walker, which should be out for review
soonish.
Also, this patch removes a field from UpwardsMemoryQuery and has a few
lines of diff from clang-format'ing MemorySSA.cpp.
llvm-svn: 273723
This patch adds round-trip support for MachO Universal binaries to obj2yaml and yaml2obj. Universal binaries have a header and list of architecture structures, followed by a the individual object files at specified offsets.
llvm-svn: 273719
By putting all the possible commutations together, we simplify the code.
Note that this is NFCI, but I'm adding tests that actually exercise each
commutation pattern because we don't have this anywhere else.
llvm-svn: 273702
a good error message to be produced.
This is nearly the last libObject interface that used ErrorOr and the last one
that appears in llvm/include/llvm/Object/MachO.h . For Mach-O objects this is
just a clean up because it’s version of getSymbolAddress() can’t return an
error.
I will leave it to the experts on COFF and ELF to actually add meaning full
error messages in their tests if they wish. And also leave it to these experts
to change the last two ErrorOr interfaces in llvm/include/llvm/Object/ObjectFile.h
for createCOFFObjectFile() and createELFObjectFile() if they wish.
Since there are no test cases for COFF and ELF error cases with respect to
getSymbolAddress() in the test suite this is no functional change (NFC).
llvm-svn: 273701
Tail merge was making the assumption that a layout successor or
predecessor was always a cfg successor/predecessor. Remove that
assumption. Changes to tests are necessary because the errant cfg edges
were preventing optimizations.
llvm-svn: 273700
Clang emits them in reverse order to conform to the ABI, which requires
left-to-right destruction. As a result, the order doesn't fall out
naturally, and we have to sort things out in the backend.
Fixes PR28213
llvm-svn: 273696
We bailed out while printing codeview for an MSVC compiled
SemaExprCXX.cpp that used this record. The MS reference headers look
incorrect here, which is probably why we had this bug. They use a 32-bit
enum as the field type, but the actual record appears to use one byte
for the cookie kind followed by a flags byte.
llvm-svn: 273691