Some are named "FP", others "SD", others still "FP*SD".
Rename all this to just use "FP", which, except for conversions
(which don't use this format naming scheme), implies "SD" anyway.
llvm-svn: 243936
It's already in SysRegMappings, no need to also have it in MSRMappings:
the latter is only used if we didn't find a match in the former.
llvm-svn: 243933
This happens to work, but is not guaranteed to work. Indeed, most memcpy
interfaces in Linux-land annotate these arguments as nonnull, and GCC
and LLVM both can and do optimized based upon that. When they do so,
they might legitimately have miscompiled code calling this routine with
two valid iterators, 'nullptr' and 'nullptr'. There was even code doing
precisely this because StringRef().begin() and StringRef().end() both
produce null pointers.
This was found by UBSan.
llvm-svn: 243927
There's a bunch of code in LowerFCOPYSIGN that does smart lowering, and
is actually already vector-aware; let's use it instead of scalarizing!
The only interesting change is that for v2f32, we previously always used
use v4i32 as the integer vector type.
Use v2i32 instead, and mark FCOPYSIGN as Custom.
llvm-svn: 243926
We used to legalize it like it's any other binary operations. It's not,
because it accepts mismatched operand types. Because of that, we used
to hit various asserts and miscompiles.
Specialize vector legalizations to, in the worst case, unroll, or, when
possible, to just legalize the operand that needs legalization.
Scalarization isn't covered, because I can't think of a target where
some but not all of the 1-element vector types are to be scalarized.
llvm-svn: 243924
Various value handles needed to be copy constructible and copy
assignable (mostly for their use in DenseMap). But to avoid an API that
might allow accidental slicing, make these members protected in the base
class and make derived classes final (the special members become
implicitly public there - but disallowing further derived classes that
might be sliced to the intermediate type).
Might be worth having a warning a bit like -Wnon-virtual-dtor that
catches public move/copy assign/ctors in classes with virtual functions.
(suppressable in the same way - by making them protected in the base,
and making the derived classes final) Could be fancier and only diagnose
them when they're actually called, potentially.
Also allow a few default implementations where custom implementations
(especially with non-standard return types) were implemented.
llvm-svn: 243909
Fixes obvious memory leak in test
TestForEofAfterReadFailureOnDataStreamer. Also removes constexpr use
from same test.
Patch by Karl Schimpf.
Differential Revision: http://reviews.llvm.org/D11735
llvm-svn: 243904
through PHI nodes across iterations.
This patch teaches the new advanced loop unrolling heuristics to propagate
constants into the loop from the preheader and around the backedge after
simulating each iteration. This lets us brute force solve simple recurrances
that aren't modeled effectively by SCEV. It also makes it more clear why we
need to process the loop in-order rather than bottom-up which might otherwise
make much more sense (for example, for DCE).
This came out of an attempt I'm making to develop a principled way to account
for dead code in the unroll estimation. When I implemented
a forward-propagating version of that it produced incorrect results due to
failing to propagate *cost* between loop iterations through the PHI nodes, and
it occured to me we really should at least propagate simplifications across
those edges, and it is quite easy thanks to the loop being in canonical and
LCSSA form.
Differential Revision: http://reviews.llvm.org/D11706
llvm-svn: 243900
While checking for the existence of the clang-tools-extra directory,
the script was not checking for its destination name, "extra", and
the script was failing when re-running without checking out new
sources.
llvm-svn: 243898
Some functions return concrete ByteStreamers by value - explicitly
support that in the base class. (dtor can be virtual, no one seems to be
polymorphically owning/destroying them)
llvm-svn: 243897
This reverts commit r243888, recommitting r243824.
This broke the Windows build due to a difference in the C++ standard
library implementation. Using emplace/forward_as_tuple should ensure
there's no need to copy ValIDs.
llvm-svn: 243896
This fixes a bug found while working on the bitcode reader. In
particular, the method BitstreamReader::AtEndOfStream doesn't always
behave correctly when processing a data streamer. The method
fillCurWord doesn't properly set CurWord/BitsInCurWord if the data
streamer was already at eof, but GetBytes had not yet set the
ObjectSize field of the streaming memory object.
This patch fixes this problem, and provides a test to show that
this problem has been fixed.
Patch by Karl Schimpf.
Differential Revision: http://reviews.llvm.org/D11391
llvm-svn: 243890
Since r241097, `DIBuilder` has only created distinct `DICompileUnit`s.
The backend is liable to start relying on that (if it hasn't already),
so make uniquable `DICompileUnit`s illegal and automatically upgrade old
bitcode. This is a nice cleanup, since we can remove an unnecessary
`DenseSet` (and the associated uniquing info) from `LLVMContextImpl`.
Almost all the testcases were updated with this script:
git grep -e '= !DICompileUnit' -l -- test |
grep -v test/Bitcode |
xargs sed -i '' -e 's,= !DICompileUnit,= distinct !DICompileUnit,'
I imagine something similar should work for out-of-tree testcases.
llvm-svn: 243885
This is necessary for WatchOS support, where the compact unwind format assumes
this kind of layout. For now we only want this on Swift-like CPUs though, where
it's been the Xcode behaviour for ages. Also, since it can expand the prologue
we don't want it at -Oz.
llvm-svn: 243884
Instead of cloning distinct `MDNode`s when linking in a module, just
move them over. The module linker destroys the source module, so the
old node would otherwise just be leaked on the context. Create the new
node in place. This also reduces the number of cloned uniqued nodes
(since it's less likely their operands have changed).
This mapping strategy is only correct when we're discarding the source,
so the linker turns it on via a ValueMapper flag, `RF_MoveDistinctMDs`.
There's nothing observable in terms of `llvm-link` output here: the
linked module should be semantically identical.
I'll be adding more 'distinct' nodes to the debug info metadata graph in
order to break uniquing cycles, so the benefits of this will partly come
in future commits. However, we should get some gains immediately, since
we have a fair number of 'distinct' `DILocation`s being linked in.
llvm-svn: 243883
Instead of always showing/printing all functions, a class derived from
the DOTViewer class can overwrite the set of functions that will be
processed.
This will be used (and tested) by Polly's scop viewers, but other users
can be imagined as well.
llvm-svn: 243881
Summary:
This is useful for PNaCl's `RewriteAtomics` pass. NaCl intrinsics don't exist for some of the more exotic RMW instructions, so by refactoring this function into its own, `RewriteAtomics` can share code rewriting those atomics with `AtomicExpand` while additionally saving a few cycles by generating the `cmpxchg` NaCl-specific intrinsic with the callback. Without this patch, `RewriteAtomics` would require two extra passes over functions, by first requiring use of the full `AtomicExpand` pass to just expand the leftover exotic RMWs and then running itself again to expand resulting `cmpxchg`s.
NFC
Reviewers: jfb
Subscribers: jfb, llvm-commits
Differential Revision: http://reviews.llvm.org/D11422
llvm-svn: 243880
* generate function with string attribute using API,
* dump it in LL format,
* try to parse.
Add parser support for string attributes to fix the issue.
Reviewed By: reames, hfinkel
Differential Revision: http://reviews.llvm.org/D11058
llvm-svn: 243877
Summary:
Modify the cost calculation function for interleaved accesses
to use the target-specific costs for insert/extract element and
memory operations.
This better models the case where the backend can't match
the interleaved group, and we are forced to use a wide load
and shuffle vectors.
Interleaved accesses are not enabled by default, so this shouldn't
cause a performance change.
Reviewers: jmolloy
Subscribers: jmolloy, llvm-commits
Differential Revision: http://reviews.llvm.org/D11718
llvm-svn: 243875
Enabling merging of extern globals appears to be generally either beneficial or
harmless. On some benchmarks suites (on Cortex-M4F, Cortex-A9, and Cortex-A57)
it gives improvements in the 1-5% range, but in the rest the overall effect is
zero.
Differential Revision: http://reviews.llvm.org/D10966
llvm-svn: 243874
Adjust the GlobalMergeOnExternal option so that the default behaviour is to
do whatever the Target thinks is best. Explicitly enabled or disabling the
option will override this default.
Differential Revision: http://reviews.llvm.org/D10965
llvm-svn: 243873
The test/DebugInfo/dwarfdump-macho-universal.test test added in r243862 uses
an input from another test's directory (test/tools/dsymutil/Inputs/fat-test.o)
which breaks our test setup.
Copying the required test input to the test's Input directory to fix the issue.
llvm-svn: 243872
In http://reviews.llvm.org/rL215382, IT forming was made more conservative under
the belief that a flag-setting instruction was unpredictable inside an IT block on ARMv6M.
But actually, ARMv6M doesn't even support IT blocks so that's impossible. In the ARMARM for
v7M, v7AR and v8AR it states that the semantics of such an instruction changes inside an
IT block - it doesn't set the flags. So actually it is fine to use one inside an IT block
as long as the flags register is dead afterwards.
This gives significant performance improvements in a variety of MPEG based workloads.
Differential revision: http://reviews.llvm.org/D11680
llvm-svn: 243869
This is a minor optimization to only check for unresolved operands
inside `mapDistinctNode()` if the operands have actually changed. This
shouldn't really cause any change in behaviour. I didn't actually see a
slowdown in a profile, I was just poking around nearby and saw the
opportunity.
llvm-svn: 243866
Summary: This currently sets the shift amount RHS to the same type as the LHS, and assumes that the LHS is a simple type. This isn't currently the case e.g. with weird integers sizes, but will eventually be true and will assert if not. That's what you get for having an experimental backend: break it and you get to keep both pieces. Most backends either set the RHS to MVT::i32 or MVT::i64, but WebAssembly is a virtual ISA and tries to have regular-looking binary operations where both operands are the same type (even if a 64-bit RHS shifter is slightly silly, hey it's free!).
Subscribers: llvm-commits, sunfish, jfb
Differential Revision: http://reviews.llvm.org/D11715
llvm-svn: 243860
Change `DIELoc` and `DIEBlock` to stop inheriting from `DIE`, instead
inheriting from `DIEValueList` to share the value storage API. This
awkward bit of code-sharing was also fairly confusing: neither `DIELoc`
nor `DIEBlock` represents a `DIE`, so why would they inherit from it?
Aside from the API cleanup, this should improve debug info memory usage
in the backend, since it shaves five pointers off of every `DIELoc` and
`DIEBlock`. I haven't bothered to measure the savings, though.
llvm-svn: 243858
Split out a helper `printValues()` for printing `DIEBlock` and `DIELoc`,
instead of relying on `DIE::print()`. The shared code was actually
fairly small there. No functionality change intended.
llvm-svn: 243856
Rewrite `DIEValueList` as a subclass of `DIE`, renaming its API to match
`DIE`'s. This is preparation for changing `DIEBlock` and `DIELoc` to
stop inheriting from `DIE` and inherit directly from `DIEValueList`.
I thought about leaving this as a has-a relationship (and changing
`DIELoc` and `DIEBlock` to also have-a `DIEValueList`), but that seemed
to require a fair bit more boilerplate and I think it needed more
changes to the `DwarfUnit` API than this will.
No functionality change intended here.
llvm-svn: 243854
The XformToShuffleWithZero method currently checks AND masks at the per-lane level for all-one and all-zero constants and attempts to convert them to legal shuffle clear masks.
This patch generalises XformToShuffleWithZero, splitting and checking the sub-lanes of the constants down to the byte level to see if any legal shuffle clear masks are possible. This allows a lot of masks (often from legalization or truncation) to be folded into existing shuffle patterns and removes a lot of constant mask loading.
There are a few examples of poor shuffle lowering that are exposed by this patch that will be cleaned up in future patches (e.g. merging shuffles that are separated by bitcasts, x86 legalized v8i8 zero extension uses PMOVZX+AND+AND instead of AND+PMOVZX, etc.)
Differential Revision: http://reviews.llvm.org/D11518
llvm-svn: 243831
Remove some unnecessary explicit special members in Hexagon that, once
removed, allow the other implicit special members to be used without
depending on deprecated features.
llvm-svn: 243825
Summary: Also test 64-bit integers, except shifts for now which are broken because isel dislikes the 32-bit truncate that precedes them.
Reviewers: sunfish
Subscribers: llvm-commits, jfb
Differential Revision: http://reviews.llvm.org/D11699
llvm-svn: 243822
Various targets use std::swap on specific MCAsmOperands (ARM and
possibly Hexagon as well). It might be helpful to mark those subclasses
as final, to ensure that the availability of move/copy operations can't
lead to slicing. (same sort of requirements as the non-vitual dtor -
protected or a final class)
llvm-svn: 243820
This commit fixes a bug in the class 'SIInstrInfo' where the implicit register
machine operands were added to a machine instruction in an incorrect order -
the implicit uses were added before the implicit defs.
I found this bug while working on moving the implicit register operand
verification code from the MIR parser to the machine verifier.
This commit also makes the method 'addImplicitDefUseOperands' in the machine
instruction class public so that it can be reused in the 'SIInstrInfo' class.
Reviewers: Matt Arsenault
Differential Revision: http://reviews.llvm.org/D11689
llvm-svn: 243799
Summary:
For example, in
struct S {
int *x;
int *y;
};
__global__ void foo(S s) {
int *b = s.y;
// use b
}
"b" is guaranteed to point to global. NVPTX should emit ld.global/st.global for
accessing "b".
Reviewers: jholewinski
Subscribers: llvm-commits, jholewinski
Differential Revision: http://reviews.llvm.org/D11505
llvm-svn: 243790
Summary:
Use -1 as numoperands for the return SDTypeProfile, denoting that return is variadic. Note that the patterns in InstrControl.td still need to match the inputs, so this ins't an "anything goes" variadic on ret!
The next step will be to handle other local types (not just int32).
Reviewers: sunfish
Subscribers: llvm-commits, jfb
Differential Revision: http://reviews.llvm.org/D11692
llvm-svn: 243783
Successive versions of LLVM should retain the ability to parse bitcode
generated by old releases of the compiler. This adds a bitcode format
compatibility test, which is intended to provide good (albeit not
entirely exhaustive) coverage of the current LangRef.
This also includes compatibility tests for LLVM 3.6. After every 3.X.0
release, the compatibility.ll file from the 3.X branch should be copied
to compatibility-3.X.ll on trunk, and the 3.X.0 release used to generate
a corresponding bitcode file.
Patch by Vedant Kumar!
llvm-svn: 243779
When encountering a scattered relocation, the code would assert trying to
access an unexisting section. I couldn't find a way to expose the result
of the processing of a scattered reloc, and I'm really unsure what the
right thing to do is. This patch just skips them during the processing in
DwarfContext and adds a mach-o file to the tests that exposed the asserting
behavior.
(This is a new failure that is being exposed by Rafael's recent work on
the libObject interfaces. I think the wrong behavior has always happened,
but now it's asserting)
llvm-svn: 243778
Remove the fake `DW_TAG_auto_variable` and `DW_TAG_arg_variable` tags,
using `DW_TAG_variable` in their place Stop exposing the `tag:` field at
all in the assembly format for `DILocalVariable`.
Most of the testcase updates were generated by the following sed script:
find test/ -name "*.ll" -o -name "*.mir" |
xargs grep -l 'DILocalVariable' |
xargs sed -i '' \
-e 's/tag: DW_TAG_arg_variable, //' \
-e 's/tag: DW_TAG_auto_variable, //'
There were only a handful of tests in `test/Assembly` that I needed to
update by hand.
(Note: a follow-up could change `DILocalVariable::DILocalVariable()` to
set the tag to `DW_TAG_formal_parameter` instead of `DW_TAG_variable`
(as appropriate), instead of having that logic magically in the backend
in `DbgVariable`. I've added a FIXME to that effect.)
llvm-svn: 243774