Commit Graph

24974 Commits

Author SHA1 Message Date
Andrew Kaylor 16c4da03d5 Improved the interface of methods commuting operands, improved X86-FMA3 mem-folding&coalescing.
Patch by Slava Klochkov (vyacheslav.n.klochkov@intel.com)

Differential Revision: http://reviews.llvm.org/D11370

llvm-svn: 248735
2015-09-28 20:33:22 +00:00
Artur Pilipenko b4d009042b Introduce !align metadata for load instruction
Reviewed By: hfinkel

Differential Revision: http://reviews.llvm.org/D12853

llvm-svn: 248721
2015-09-28 17:41:08 +00:00
Justin Bogner d7d1a72f66 AsmWriter: Print the argument names in declarations while debugging
When llvm declarations have argument names, it's helpful to actually
print those names when debugging. Arguably, it'd be nice to print them
all the time, but that would mean the IR we output wouldn't round trip
through bitcode, which doesn't store the names.

Make the varous print() methods in AsmWriter optionally print "for
debug" and set that flag in the dump() methods. The only thing this
does differently for now is print the argument names in declarations.

llvm-svn: 248692
2015-09-27 22:38:50 +00:00
Joseph Tremoulet 09af67aba5 [EH] Create removeUnwindEdge utility
Summary:
Factor the code that rewrites invokes to calls and rewrites WinEH
terminators to their "unwind to caller" equivalents into a helper in
Utils/Local, and use it in the three places I'm aware of that need to do
this.


Reviewers: andrew.w.kaylor, majnemer, rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13152

llvm-svn: 248677
2015-09-27 01:47:46 +00:00
Benjamin Kramer 2a63631abd [BranchProbability] Manually round the floating point output.
llvm::format compiles down to snprintf which has no defined rounding for
floating point arguments, and MSVC has implemented it differently from
what the BSD libcs and glibc do. Try to emulate the glibc rounding
behavior to avoid changing tests.

While there simplify code a bit and move trivial methods inline.

llvm-svn: 248665
2015-09-26 10:09:36 +00:00
Sanjoy Das 96709c4854 [SCEV] Reapply 'Exploit A < B => (A+K) < (B+K) when possible'
Summary:

This change teaches SCEV's `isImpliedCond` two new identities:

  A u< B u< -C          =>  (A + C) u< (B + C)
  A s< B s< INT_MIN - C =>  (A + C) s< (B + C)

While these are useful on their own, they're really intended to support
D12950.

The original checkin, r248606 had to be backed out due to an issue with
a ObjCXX unit test.  That issue is now fixed, so re-landing.

Reviewers: atrick, reames, majnemer, nlewycky, hfinkel

Subscribers: aadg, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D12948

llvm-svn: 248637
2015-09-25 23:53:45 +00:00
Matthias Braun 93ab942c24 LivePhysRegs: Fix live-outs of return blocks
I realized that the live-out set computed for the return block is
missing the callee saved registers (the non-pristine ones to be exact).

This only affects the liveness computed for instructions inside the
function epilogue which currently none of the LivePhysRegs users in llvm
cares about, so this is just a drive-by fix without a testcase.

Differential Revision: http://reviews.llvm.org/D13180

llvm-svn: 248636
2015-09-25 23:50:53 +00:00
Cong Hou 15ea016346 Use fixed-point representation for BranchProbability.
BranchProbability now is represented by its numerator and denominator in uint32_t type. This patch changes this representation into a fixed point that is represented by the numerator in uint32_t type and a constant denominator 1<<31. This is quite similar to the representation of BlockMass in BlockFrequencyInfoImpl.h. There are several pros and cons of this change:

Pros:

1. It uses only a half space of the current one.
2. Some operations are much faster like plus, subtraction, comparison, and scaling by an integer.

Cons:

1. Constructing a probability using arbitrary numerator and denominator needs additional calculations.
2. It is a little less precise than before as we use a fixed denominator. For example, 1 - 1/3 may not be exactly identical to 1 / 3 (this will lead to many BranchProbability unit test failures). This should not matter when we only use it for branch probability. If we use it like a rational value for some precise calculations we may need another construct like ValueRatio.

One important reason for this change is that we propose to store branch probabilities instead of edge weights in MachineBasicBlock. We also want clients to use probability instead of weight when adding successors to a MBB. The current BranchProbability has more space which may be a concern.

Differential revision: http://reviews.llvm.org/D12603

llvm-svn: 248633
2015-09-25 23:09:59 +00:00
Matthias Braun c804cdb912 TargetRegisterInfo: Introduce PrintLaneMask.
This makes it more convenient to print lane masks and lead to more
uniform printing.

llvm-svn: 248624
2015-09-25 21:51:24 +00:00
Matthias Braun e6a2485e1a TargetRegisterInfo: Add typedef unsigned LaneBitmask and use it where apropriate; NFC
llvm-svn: 248623
2015-09-25 21:51:14 +00:00
Tom Stellard 8e0257625d MCAsmInfo: Allow targets to specify when the .section directive should be omitted
Summary:
The default behavior is to omit the .section directive for .text, .data,
and sometimes .bss, but some targets may want to omit this directive for
other sections too.

The AMDGPU backend will uses this to emit a simplified syntax for section
switches.  For example if the section directive is not omitted (current
behavior), section switches to .hsatext will be printed like this:

.section .hsatext,#alloc,#execinstr,#write

This is actually wrong, because .hsatext has some custom STT_* flags,
which MC doesn't know how to print or parse.

If the section directive is omitted (made possible by this commit),
section switches will be printed like this:

.hsatext

The motivation for this patch is to make it possible to emit sections
with custom STT_* flags without having to teach MC about all the target
specific STT_* flags.

Reviewers: rafael, grosbach

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12423

llvm-svn: 248618
2015-09-25 21:41:14 +00:00
Matthias Braun c2d4befb54 MachineBasicBlock: Factor out common code into isReturnBlock()
llvm-svn: 248617
2015-09-25 21:25:19 +00:00
Sanjoy Das 4a39b97671 Revert two SCEV changes that caused test failures in clang.
r248606: "[SCEV] Exploit A < B => (A+K) < (B+K) when possible"
r248608: "[SCEV] Teach isLoopBackedgeGuardedByCond to exploit trip counts."
llvm-svn: 248614
2015-09-25 21:16:50 +00:00
Sanjoy Das df1635d394 [SCEV] Extract helper function from isImpliedCond; NFC
Summary:
This new helper routine will be used in a subsequent change.

Reviewers: hfinkel

Subscribers: hfinkel, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D12949

llvm-svn: 248607
2015-09-25 19:59:52 +00:00
Sanjoy Das fdec9deb13 [SCEV] Exploit A < B => (A+K) < (B+K) when possible
Summary:

This change teaches SCEV's `isImpliedCond` two new identities:

  A u< B u< -C          =>  (A + C) u< (B + C)
  A s< B s< INT_MIN - C =>  (A + C) s< (B + C)

While these are useful on their own, they're really intended to support
D12950.

Reviewers: atrick, reames, majnemer, nlewycky, hfinkel

Subscribers: aadg, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D12948

llvm-svn: 248606
2015-09-25 19:59:49 +00:00
James Molloy eb46641c28 [GlobalsAA] Teach GlobalsAA about nocapture
Arguments to function calls marked "nocapture" can be marked as
non-escaping. However, nocapture is defined in terms of the lifetime
of the callee, and if the callee can directly or indirectly recurse to
the caller, the semantics of nocapture are invalid.

Therefore, we eagerly discover which SCC each function belongs to,
and later can check if callee and caller of a callsite belong to
the same SCC, in which case there could be recursion.

This means that we can't be so optimistic in
getModRefInfo(ImmutableCallsite) - previously we assumed all call
arguments never aliased with an escaping global. Now we need to check,
because a global could now be passed as an argument but still not
escape.

This also solves a related conformance problem: MemCpyOptimizer can
turn non-escaping stores of globals into calls to intrinsics like
llvm.memcpy/llvm/memset. This confuses GlobalsAA, which knows the
global can't escape and so returns NoModRef when queried, when
obviously a memcpy/memset call does indeed reference and modify its
arguments.

This fixes PR24800, PR24801, and PR24802.

llvm-svn: 248576
2015-09-25 15:39:29 +00:00
Sanjoy Das b513a9fa4f [Bitcode][Asm] Teach LLVM to read and write operand bundles.
Summary:
This also adds the first set of tests for operand bundles.

The optimizer has not been audited to ensure that it does the right
thing with operand bundles.

Depends on D12456.

Reviewers: reames, chandlerc, majnemer, dexonsmith, kmod, JosephTremoulet, rnk, bogner

Subscribers: maksfb, llvm-commits

Differential Revision: http://reviews.llvm.org/D12457

llvm-svn: 248551
2015-09-24 23:34:52 +00:00
Rafael Espindola 4405d5d889 Use ELFOSABI_NONE instead of ELFOSABI_LINUX.
The doesn't seem to be a difference and ELFOSABI_NONE seems to be far more
common:

* Linux doesn't care when loading and puts ELFOSABI_NONE on core dumps.
* Gold and bfd ld produce files with ELFOSABI_NONE.
* Gold and bfd ld seems to ignore EI_OSABI other than for freebsd.
* Gas puts ELFOSABI_NONE in most .o files.

llvm-svn: 248534
2015-09-24 20:57:24 +00:00
Matt Arsenault e66621b306 AMDGPU: Add s_dcache_* instructions
llvm-svn: 248533
2015-09-24 19:52:27 +00:00
Matt Arsenault d6adfb401c AMDGPU: Add cache invalidation instructions.
These are necessary for implementing mem_fence for
OpenCL 2.0.

The VI assembler tests are disabled since it seems to be
using the wrong encoding or opcode.

llvm-svn: 248532
2015-09-24 19:52:21 +00:00
Sanjoy Das 9303c24650 [IR] Add operand bundles to CallInst and InvokeInst.
Summary:
This change teaches `CallInst`s and `InvokeInst`s to maintain a set of
operand bundles as part of its operands.  `CallInst`s and `InvokeInst`s
with operand bundles co-allocate some space before their `Use` array to
hold meta information about which of its operands are part of an operand
bundle.

The strings corresponding to the bundle tags are interned into
`LLVMContextImpl::BundleTagCache`

This change does not include any parsing / bitcode support.  That's the
next change.

Depends on D12455.

Reviewers: reames, chandlerc, majnemer, dexonsmith, kmod, JosephTremoulet, rnk, bogner

Subscribers: MatzeB, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D12456

llvm-svn: 248527
2015-09-24 19:14:18 +00:00
Artyom Skrobov cf296444ab [ARM] Handle +t2dsp feature as an ArchExtKind in ARMTargetParser.def
Currently, the availability of DSP instructions (ACLE 6.4.7) is handled in a
hand-rolled tricky condition block in tools/clang/lib/Basic/Targets.cpp, with
a FIXME: attached.

This patch changes the handling of +t2dsp to be in line with other
architecture extensions.

Following a revert of r248152 and new review comments, this patch also includes
renaming FeatureDSPThumb2 -> FeatureDSP, hasThumb2DSP() -> hasDSP(), etc.
The spelling of "t2dsp" is preserved, pending a further investigation of its
possible external usage.

Differential Revision: http://reviews.llvm.org/D12937

llvm-svn: 248519
2015-09-24 17:31:16 +00:00
Matt Arsenault 68d938649e Introduce target hook for optimizing register copies
Allow a target to do something other than search for copies
that will avoid cross register bank copies.

Implement for SI by only rewriting the most basic copies,
so it should look through anything like a subregister extract.

I'm not entirely satisified with this because it seems like
eliminating a reg_sequence that isn't fully used should work
generically for all targets without them having to override
something. However, it seems to be tricky to have a simple
implementation of this without rewriting to invalid  kinds
of subregister copies on some targets.

I'm not sure if there is currently a generic way to easily check
if a subregister index would be valid for the current use.
The current set of TargetRegisterInfo::get*Class functions don't
quite behave like I would expect (e.g. getSubClassWithSubReg
returns the maximal register class rather than the minimal), so
I'm not sure how to make the generic test keep searching if
SrcRC:SrcSubReg is a valid replacement for DefRC:DefSubReg. Making
the default implementation to check for simple copies breaks
a variety of ARM and x86 tests by producing illegal subregister uses.

The ARM tests are not actually changed since it should still be using
the same sharesSameRegisterFile implementation, this just relaxes
them to not check for specific registers.

llvm-svn: 248478
2015-09-24 08:36:14 +00:00
Justin Bogner 49655f806f BasicAA: Move BasicAAResult::alias out-of-line. NFC
This makes the header more readable and cleans up some unnecessary
header differences between NDEBUG and !NDEBUG.

llvm-svn: 248462
2015-09-24 04:59:24 +00:00
Sanjoy Das b07fc572bd [IR] Teach `llvm::User` to co-allocate a descriptor.
Summary:
With this change, subclasses of `llvm::User` will be able to co-allocate
a variable number of bytes (called a "descriptor") with the `llvm::User`
instance.  The co-allocated descriptor can later be accessed using
`llvm::User::getDescriptor`.  This will be used in later changes to
implement operand bundles.

This change steals one bit from `NumUserOperands`, but given that it is
still 28 bits wide I don't think this will be a practical issue.

This change does not allow allocating hung off uses with descriptors.
This only for simplicity, not for any fundamental reason; and we can
easily add this functionality later if needed.

Reviewers: reames, chandlerc, dexonsmith, kmod, majnemer, pete, JosephTremoulet

Subscribers: pete, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D12455

llvm-svn: 248453
2015-09-24 01:00:49 +00:00
Rui Ueyama 464d8eb9b7 Remove iterator_range::end.
Because the current proposal does not include that member function,
and we are trying to keep in line with that.

llvm-svn: 248451
2015-09-24 00:23:07 +00:00
Rui Ueyama ffc9b444a4 Add iterator_range::end() predicate.
llvm-svn: 248447
2015-09-23 23:58:29 +00:00
Sanjay Patel 13e8bbc237 set div/rem default values to 'expensive' in TargetTransformInfo's cost model
...because that's what the cost model was intended to do.

As discussed in D12882, this fix has a temporary unintended consequence for
SimplifyCFG: it causes us to not speculate an fdiv. However, two wrongs make
PR24818 right, and two wrongs make PR24343 act right even though it's really
still wrong.

I intend to correct SimplifyCFG and add to CodeGenPrepare to account for this
cost model change and preserve the righteousness for the bug report cases.

https://llvm.org/bugs/show_bug.cgi?id=24818
https://llvm.org/bugs/show_bug.cgi?id=24343

Differential Revision: http://reviews.llvm.org/D12882

llvm-svn: 248439
2015-09-23 22:28:18 +00:00
Philip Reames e144025652 [docs] Update DominatorTree docs to clarify expectations around unreachable blocks
Note: I'm am not trying to describe what "should be"; I'm only describing what is true today.

This came out of my recent question to llvm-dev titled: When can the dominator tree not contain a node for a basic block?

Differential Revision: http://reviews.llvm.org/D13078

llvm-svn: 248417
2015-09-23 18:39:37 +00:00
Evgeniy Stepanov a2002b08f7 Android support for SafeStack.
Add two new ways of accessing the unsafe stack pointer:

* At a fixed offset from the thread TLS base. This is very similar to
  StackProtector cookies, but we plan to extend it to other backends
  (ARM in particular) soon. Bionic-side implementation here:
  https://android-review.googlesource.com/170988.
* Via a function call, as a fallback for platforms that provide
  neither a fixed TLS slot, nor a reasonable TLS implementation (i.e.
  not emutls).

This is a re-commit of a change in r248357 that was reverted in
r248358.

llvm-svn: 248405
2015-09-23 18:07:56 +00:00
Sanjoy Das 2aacc0ecca [SCEV] Introduce ScalarEvolution::getOne and getZero.
Summary:
It is fairly common to call SE->getConstant(Ty, 0) or
SE->getConstant(Ty, 1); this change makes such uses a little bit
briefer.

I've refactored the call sites I could find easily to use getZero /
getOne.

Reviewers: hfinkel, majnemer, reames

Subscribers: sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D12947

llvm-svn: 248362
2015-09-23 01:59:04 +00:00
Evgeniy Stepanov 8d0e3011d8 Revert "Android support for SafeStack."
test/Transforms/SafeStack/abi.ll breaks when target is not supported;
needs refactoring.

llvm-svn: 248358
2015-09-23 01:23:22 +00:00
Evgeniy Stepanov ce2e16f00c Android support for SafeStack.
Add two new ways of accessing the unsafe stack pointer:

* At a fixed offset from the thread TLS base. This is very similar to
  StackProtector cookies, but we plan to extend it to other backends
  (ARM in particular) soon. Bionic-side implementation here:
  https://android-review.googlesource.com/170988.
* Via a function call, as a fallback for platforms that provide
  neither a fixed TLS slot, nor a reasonable TLS implementation (i.e.
  not emutls).

llvm-svn: 248357
2015-09-23 01:03:51 +00:00
Adrian Prantl 4410392e96 IR: Add a setDWOId() method to DICompileUnit.
Tested via clang.

llvm-svn: 248342
2015-09-22 23:21:06 +00:00
Adrian Prantl fbfc58e112 IR: Fix the return value of DICompileUnit::getDWOId.
llvm-svn: 248341
2015-09-22 23:21:03 +00:00
Aaron Ballman 7eb86d1d2d Instead of defining the operator delete() function, it is better to delete the function so that any uses (even from within Node or its subclasses) do not accidentally call it. NFC intended.
llvm-svn: 248320
2015-09-22 21:00:35 +00:00
Ahmed Bougacha 07a844d758 [AArch64] Emit clrex in the expanded cmpxchg fail block.
In the comparison failure block of a cmpxchg expansion, the initial
ldrex/ldxr will not be followed by a matching strex/stxr.
On ARM/AArch64, this unnecessarily ties up the execution monitor,
which might have a negative performance impact on some uarchs.

Instead, release the monitor in the failure block.
The clrex instruction was designed for this: use it.

Also see ARMARM v8-A B2.10.2:
"Exclusive access instructions and Shareable memory locations".

Differential Revision: http://reviews.llvm.org/D13033

llvm-svn: 248291
2015-09-22 17:21:44 +00:00
NAKAMURA Takumi 10c80e7996 Prune trailing whitespaces.
llvm-svn: 248265
2015-09-22 11:19:03 +00:00
NAKAMURA Takumi a9cb538a74 Reformat blank lines.
llvm-svn: 248263
2015-09-22 11:14:39 +00:00
NAKAMURA Takumi 84965031a7 Reformat comment lines.
llvm-svn: 248262
2015-09-22 11:14:12 +00:00
NAKAMURA Takumi 70ad98aca4 Reformat.
llvm-svn: 248261
2015-09-22 11:13:55 +00:00
Matthias Braun d3dd1354a4 LiveIntervalAnalysis: Factor common code into splitSeparateComponents; NFC
llvm-svn: 248241
2015-09-22 03:44:41 +00:00
Matthias Braun 55a72361d7 Remove declarations for methods that do not exist.
llvm-svn: 248238
2015-09-22 01:52:44 +00:00
NAKAMURA Takumi 0ce5686f92 Fix r248164. [-Wdocumentation]
llvm-svn: 248237
2015-09-22 01:44:21 +00:00
Stephen Canon b12db0e42c Remove roundingMode argument in APFloat::mod
Because mod is always exact, this function should have never taken a rounding mode argument.  The actual implementation still has issues, which I'll look at resolving in a subsequent patch.

llvm-svn: 248195
2015-09-21 19:29:25 +00:00
Chandler Carruth be58ff8232 [ADT] Remove a couple of the always inline attributes I added.
Based on conversations with Justin and a few others, these constructors
are really useful to have in the executable so that you can call them
from the debugger. After some measurements, these *particular* calls
aren't so problematic as to make them a good tradeoff for always inline.

Please let me know if there are other functions really needed for
debugging. The always inline attribute is a hack that we should only
really employ when it doesn't hurt.

llvm-svn: 248188
2015-09-21 18:02:24 +00:00
Marcello Maggioni ab58c74d98 [DivergenceAnalysis] Separated definition of class into header.
The definition of the DivergenceAnalysis pass was in a CPP
file and wasn't accessible to users of the analysis to get it
through "getAnalysis<>()".
This patch extracts the definition into a separate header that
can be used by users of the analysis to fetch the results.

Patch by Volkan Keles (vkeles@apple.com)

llvm-svn: 248186
2015-09-21 17:58:14 +00:00
James Molloy e46da3849a Revert "[ARM] Handle +t2dsp feature as an ArchExtKind in ARMTargetParser.def"
This was committed without the code review (http://reviews.llvm.org/D12937) being approved.

This reverts commit r248152.

llvm-svn: 248174
2015-09-21 16:35:08 +00:00
Matt Arsenault 30ad1a3c01 Fix missing C++ mode comment
llvm-svn: 248167
2015-09-21 15:59:41 +00:00
Chad Rosier 03a47305ec [Machine Combiner] Refactor machine reassociation code to be target-independent.
No functional change intended.
Patch by Haicheng Wu <haicheng@codeaurora.org>!

http://reviews.llvm.org/D12887
PR24522

llvm-svn: 248164
2015-09-21 15:09:11 +00:00