Commit Graph

130670 Commits

Author SHA1 Message Date
Andrew Kaylor 5b444a21df Add optimization bisect opt-in calls for Hexagon passes
Differential Revision: http://reviews.llvm.org/D19509

llvm-svn: 267593
2016-04-26 19:46:28 +00:00
Zachary Turner ff788aa0ee Fix warnings and -Werror build on clang.
llvm-svn: 267589
2016-04-26 19:24:10 +00:00
Zachary Turner 53a65ba5c9 Parse and dump PDB DBI Stream Header Information
The DBI stream contains a lot of bookkeeping information for other
streams. In particular it contains information about section contributions
and linked modules. This patch is a first attempt at parsing some of the
information out of the DBI stream. It currently only parses and dumps the
headers of the DBI stream, so none of the module data or section
contribution data is pulled out.

This is just a proof of concept that we understand the basic properties of
the DBI stream's metadata, and followup patches will try to extract more
detailed information out.

Differential Revision: http://reviews.llvm.org/D19500
Reviewed By: majnemer, ruiu

llvm-svn: 267585
2016-04-26 18:42:34 +00:00
Krzysztof Parzyszek 4773f647bd [Tail duplication] Handle source registers with subregisters
When a block is tail-duplicated, the PHI nodes from that block are
replaced with appropriate COPY instructions. When those PHI nodes
contained use operands with subregisters, the subregisters were
dropped from the COPY instructions, resulting in incorrect code.

Keep track of the subregister information and use this information
when remapping instructions from the duplicated block.

Differential Revision: http://reviews.llvm.org/D19337

llvm-svn: 267583
2016-04-26 18:36:34 +00:00
Tim Northover 4397837be2 Reapply: "ARM: put correct symbol index on indirect pointers in __thread_ptr."
A latent bug in llvm-objdump used the wrong format specifier on 32-bit
targets, causing the test to fail. This fixes the issue.

llvm-svn: 267582
2016-04-26 18:29:16 +00:00
Justin Bogner 4d0dcb9891 Internalize: More consistent file header and include guards. NFC
Match the style here to the other headers in Transforms/IPO.

llvm-svn: 267581
2016-04-26 18:25:30 +00:00
David Majnemer 8cd77baebc [SimplifyLibCalls] sprintf doesn't copy null bytes
sprintf doesn't read or copy the terminating null byte from its string
operands.  sprintf will append its own after processing all of the
format specifiers.

This fixes PR27526.
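
As a rough illustration of the behaviour described above (a Python model only, not the SimplifyLibCalls code; both function names are made up for this sketch), a format string with no specifiers is read only up to its first null byte, and sprintf then appends one terminator of its own:

```python
def sprintf_no_specifiers(fmt: bytes) -> bytes:
    """Model the bytes C's sprintf writes for a format with no '%' specifiers.

    Hypothetical helper: `fmt` stands for the raw character array, which may
    hold extra bytes after an embedded null.
    """
    visible = fmt.split(b"\0", 1)[0]  # sprintf stops reading at the first null
    return visible + b"\0"            # and then appends its own terminator


def naive_whole_array_copy(fmt: bytes) -> bytes:
    """What a fold would write if it copied the whole array plus a terminator."""
    return fmt + b"\0"


if __name__ == "__main__":
    data = b"ab\0cd"
    print(sprintf_no_specifiers(data))   # bytes a real sprintf would produce
    print(naive_whole_array_copy(data))  # bytes a whole-array copy would produce
```

The two functions agree whenever the array has no embedded null, which is why a mismatch of this kind can go unnoticed for ordinary strings.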

llvm-svn: 267580
2016-04-26 18:16:49 +00:00
Manman Ren 1c3f65a18c Swift Calling Convention: use %RAX for sret.
We don't need to copy the sret argument into %rax upon return.
rdar://25671494

llvm-svn: 267579
2016-04-26 18:08:06 +00:00
Saleem Abdulrasool 4c6c4e2bbb tests: tweak MIR for ARM tests to correct MI issues
The Machine Instruction Verifier flagged some issues in the serialized MIR.
Adjust the input to correct them.

Fixes the remaining portion of PR27480.

llvm-svn: 267578
2016-04-26 17:54:21 +00:00
Saleem Abdulrasool 601e029ba3 test: remove some bleeding whitespace
Kill bleeding whitespace.  NFC

llvm-svn: 267577
2016-04-26 17:54:16 +00:00
Konstantin Zhuravlyov 71515e57f9 [AMDGPU] Move reserved vgpr count for trap handler usage to SIMachineFunctionInfo + minor commenting changes
Differential Revision: http://reviews.llvm.org/D19537

llvm-svn: 267573
2016-04-26 17:24:40 +00:00
Sanjay Patel d66607bd8c [CodeGenPrepare] use branch weight metadata to decide if a select should be turned into a branch
This is part of solving PR27344:
https://llvm.org/bugs/show_bug.cgi?id=27344

CGP should undo the SimplifyCFG transform for the same reason that earlier patches have used this
same mechanism: it's possible that passes between SimplifyCFG and CGP may be able to optimize the
IR further with a select in place.

For the TLI hook default, >99% taken or not taken is chosen as the default threshold for a highly
predictable branch. Even the most limited HW branch predictors will be correct on this branch almost
all the time, so even a massive mispredict penalty perf loss would be overcome by the win from all
the times the branch was predicted correctly.

As a follow-up, we could make the default target hook less conservative by using the SchedMachineModel's
MispredictPenalty. Or we could just let targets override the default by implementing the hook with that
and other target-specific options. Note that trying to statically determine mispredict rates for
close-to-balanced profile weight data is generally impossible if the HW is sufficiently advanced. I.e.,
50/50 taken/not-taken might still be 100% predictable.

Finally, note that this patch as-is will not solve PR27344 because the current __builtin_unpredictable()
branch weight default values are 4 and 64. A proposal to change that is in D19435.
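
The arithmetic behind that last point can be sketched as follows. This is a simplified Python model, not the TLI hook itself: the function name is hypothetical, and the 1000:1 pair is just an arbitrary example of weights that would clear the threshold.

```python
def is_highly_predictable(taken_weight: int, not_taken_weight: int,
                          threshold: float = 0.99) -> bool:
    """Return True if one side of the branch is taken more than `threshold`
    of the time, judged from raw branch weights (hypothetical helper)."""
    total = taken_weight + not_taken_weight
    if total == 0:
        return False  # no profile data, so make no claim
    return max(taken_weight, not_taken_weight) / total > threshold


if __name__ == "__main__":
    # Weights of 64 and 4 give 64 / 68, roughly 94%, which is below 99%.
    print(is_highly_predictable(64, 4))
    # An arbitrary, much more skewed pair clears the threshold.
    print(is_highly_predictable(1000, 1))
```

With weights of 4 and 64 the more likely side is only about 94% probable, so a >99% threshold never fires for them.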

Differential Revision: http://reviews.llvm.org/D19488

llvm-svn: 267572
2016-04-26 17:11:17 +00:00
Zachary Turner ce36c1f2ec Fix build broken due to order of initialization problem.
llvm-svn: 267571
2016-04-26 16:57:53 +00:00
Zachary Turner f34e01624a Refactor some more PDB reading code into DebugInfoPDB.
Differential Revision: http://reviews.llvm.org/D19445
Reviewed By: David Majnemer

llvm-svn: 267564
2016-04-26 16:20:00 +00:00
Konstantin Zhuravlyov 1d99c4d03c [AMDGPU] Reserve VGPRs for trap handler usage if instructed
Differential Revision: http://reviews.llvm.org/D19235

llvm-svn: 267563
2016-04-26 15:43:14 +00:00
Nico Weber fa7f4898a9 Use gcc's rules for parsing gcc-style response files
In gcc, \ escapes every character in response files. It is true that this makes
it harder to mention Windows files in rsp files, but not doing this means clang
disagrees with gcc, and also disagrees with the shell (on non-Windows) which
rsp file quoting is supposed to match. clang isn't free to choose what to do
here.

In general, the idea for response files is to take bits of your command line
and write them to a file unchanged, and have things work the same way. Since
the command line would've been interpreted by the shell, things in the rsp file
need to be subject to the same shell quoting rules.

People who want to put Windows-style paths in their response files need
to do one of the following:
* escape their backslashes
* use clang-cl, which uses cl.exe/cmd.exe quoting rules
* pass --rsp-quoting=windows to clang to tell it to use
  cl.exe/cmd.exe quoting rules for response files.
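
The effect on Windows-style paths can be seen with Python's `shlex` module, used here only as a stand-in for POSIX shell quoting. It is not clang's tokenizer, and `parse_gnu_style_rsp` is a name invented for this sketch.

```python
import shlex


def parse_gnu_style_rsp(text: str) -> list:
    """Split response-file text using POSIX shell quoting rules, where a
    backslash outside quotes escapes the character that follows it."""
    return shlex.split(text, posix=True)


if __name__ == "__main__":
    # Unescaped backslashes are consumed as escape characters.
    print(parse_gnu_style_rsp(r"-o C:\tmp\out.o"))
    # Doubling each backslash preserves the path.
    print(parse_gnu_style_rsp(r"-o C:\\tmp\\out.o"))
```

Under these rules an unescaped path loses its separators, which is why the options above either escape the backslashes or switch to Windows quoting.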

Fixes PR27464.
http://reviews.llvm.org/D19417

llvm-svn: 267556
2016-04-26 13:53:56 +00:00
Sam Kolton 3025e7f25f [AMDGPU] Assembler: basic support for SDWA instructions
Support for SDWA instructions for VOP1 and VOP2 encoding.
Not done yet:
  - converters to support optional operands and modifiers
  - VOPC
  - sext() modifier
  - intrinsics
  - VOP2b (see vop_dpp.s)
  - V_MAC_F32 (see vop_dpp.s)

Differential Revision: http://reviews.llvm.org/D19360

llvm-svn: 267553
2016-04-26 13:33:56 +00:00
Andrey Turetskiy b405606432 [X86] PR27502: Fix the LEA optimization pass.
Handle MachineBasicBlock as a memory displacement operand in the LEA optimization pass.

Differential Revision: http://reviews.llvm.org/D19409

llvm-svn: 267551
2016-04-26 12:18:12 +00:00
Marcin Koscielnicki 834381f19c [Sparc] Fix build error introduced by rL267545.
llvm-svn: 267549
2016-04-26 10:43:47 +00:00
Marcin Koscielnicki 0cfb612413 [PowerPC] Add support for llvm.thread.pointer
Differential Revision: http://reviews.llvm.org/D19304

llvm-svn: 267546
2016-04-26 10:37:22 +00:00
Marcin Koscielnicki 33571e2c41 [SPARC] [SSP] Add support for LOAD_STACK_GUARD.
This fixes PR22248 on sparc.

Differential Revision: http://reviews.llvm.org/D19386

llvm-svn: 267545
2016-04-26 10:37:14 +00:00
Marcin Koscielnicki fafb44951a [SPARC] Add support for llvm.thread.pointer.
Differential Revision: http://reviews.llvm.org/D19387

llvm-svn: 267544
2016-04-26 10:37:01 +00:00
Mehdi Amini aa309b1a81 ThinLTOCodeGenerator: preserve linkonce when in "MustPreserved" set
If the linker specifically requested that a linkonce be preserved,
we need to make sure we won't drop it even if all the uses in the
current module disappear.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267543
2016-04-26 10:35:01 +00:00
Renato Golin 5a55a029c0 Revert "ARM: put correct symbol index on indirect pointers in __thread_ptr."
This reverts commit r267488, as it broke some ARM buildbots.

llvm-svn: 267541
2016-04-26 10:02:02 +00:00
Chuang-Yu Cheng 0600e8d759 [ppc64] Reenable sibling call optimization on ppc64 since fixed tsan library tail-call issue
print-stack-trace.cc test failure of compiler-rt has been fixed by
r266869 (http://reviews.llvm.org/D19148), so reenable sibling call
optimization on ppc64

Reviewers: nemanjai kbarton
llvm-svn: 267527
2016-04-26 07:38:24 +00:00
Sanjoy Das 65c133272e Align case statements (whitespace-only cleanup)
llvm-svn: 267525
2016-04-26 05:59:14 +00:00
Sanjoy Das 51df5fae4a Symbolize operand bundle blocks for bcanalyzer
Reviewers: joker.eph

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19523

llvm-svn: 267524
2016-04-26 05:59:08 +00:00
Craig Topper c5551bfc26 [AArch64] Expand v1i64 and v2i64 ctlz.
The default is legal, which results in 'Cannot select' errors.

llvm-svn: 267522
2016-04-26 05:26:51 +00:00
Craig Topper d8d6be4f99 [ARM] Expand vector ctlz_zero_undef so it becomes ctlz.
The default is Legal, which results in 'Cannot select' errors.

llvm-svn: 267521
2016-04-26 05:04:37 +00:00
Craig Topper edb4a6ba98 [ARM] Expand v1i64 and v2i64 ctlz.
The default is legal, which results in 'Cannot select' errors.

llvm-svn: 267520
2016-04-26 05:04:33 +00:00
Dehao Chen 5d6d4841ed Tune basic block annotation algorithm.
Summary:
Instead of using maximum IR weight as the basic block weight, this patch uses the voting algorithm to find the most likely weight for the basic block. This can effectively avoid the cases when some IRs are annotated incorrectly due to code motion of the profiled binary.

This patch also updates propagate.ll unittest to include discriminator in the input file so that it is testing something meaningful.
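
A minimal sketch of the difference between the two strategies, in Python rather than the pass's C++. The function names and sample weights are invented for illustration, and tie-breaking between equally common weights is left unspecified here.

```python
from collections import Counter


def max_weight(instruction_weights):
    """Old approach as described: the block takes the largest weight seen."""
    return max(instruction_weights)


def voted_weight(instruction_weights):
    """Voting approach as described: the block takes the weight that the
    most instructions agree on."""
    return Counter(instruction_weights).most_common(1)[0][0]


if __name__ == "__main__":
    # Three instructions agree on 100; one was annotated with a stray 5000,
    # for example because code motion in the profiled binary misattributed it.
    weights = [100, 100, 100, 5000]
    print(max_weight(weights))    # dominated by the outlier
    print(voted_weight(weights))  # follows the majority
```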

Reviewers: davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19301

llvm-svn: 267519
2016-04-26 04:59:11 +00:00
Bill Seurer d6e92135bd [powerpc] mark JIT tests as UNSUPPORTED on powerpc64 big endian
Some of the JIT tests began failing with "[llvm] r266663 - [Orc] Re-commit 
r266581 with fixes for MSVC, and format cleanups." on powerpc64 big endian.  
To get the buildbots running I am marking these as UNSUPPORTED for now.

If this is fixed remove the UNSUPPORTED flag "powerpc64-unknown-linux-gnu".

In r267516 I marked these as XFAIL but they succeed on some of the bots
on stage1.

llvm-svn: 267518
2016-04-26 03:59:19 +00:00
Richard Trieu d7f31a31d1 Pass the test file in through stdin instead of by filename.
When passed in via filename, this test will fail if the path to the test
contains the strings "f1" and "f2" somewhere.  Pass the file through stdin
to prevent test failures due to coincidences in path names.

llvm-svn: 267517
2016-04-26 03:43:49 +00:00
Bill Seurer ab5171f988 [powerpc] mark JIT tests as XFAIL on powerpc64 big endian
Some of the JIT tests began failing with "[llvm] r266663 - [Orc] Re-commit 
r266581 with fixes for MSVC, and format cleanups." on powerpc64 big endian.  
To get the buildbots running I am marking these as XFAIL for now.

If this is fixed remove the XFAIL flag "powerpc64-unknown-linux-gnu".

llvm-svn: 267516
2016-04-26 02:33:22 +00:00
Hal Finkel e4c0c1679b [SimplifyCFG] Preserve !llvm.mem.parallel_loop_access when merging
When SimplifyCFG merges identical instructions from both sides of a diamond, it
can preserve !llvm.mem.parallel_loop_access (as it does with most of the other
metadata). There's no real data or control dependency change in this case.

llvm-svn: 267515
2016-04-26 02:06:06 +00:00
Hal Finkel 411d31ad72 [LoopVectorize] Don't consider conditional-load dereferenceability for marked parallel loops
I really thought we were doing this already, but we were not. Given this input:

void Test(int *res, int *c, int *d, int *p) {
  for (int i = 0; i < 16; i++)
    res[i] = (p[i] == 0) ? res[i] : res[i] + d[i];
}

we did not vectorize the loop. Even with "assume_safety" the check that we
don't if-convert conditionally-executed loads (to protect against
data-dependent dereferenceability) was not elided.

One subtlety: As implemented, it will still prefer to use a masked-load
intrinsic (given target support) over the speculated load. The choice here
seems architecture specific; the best option depends on how expensive the
masked load is compared to a regular load. Ideally, using the masked load still
reduces unnecessary memory traffic, and so should be preferred. If we'd rather
do it the other way, flipping the order of the checks is easy.

The LangRef is updated to make explicit that llvm.mem.parallel_loop_access also
implies that if-conversion is okay.

Differential Revision: http://reviews.llvm.org/D19512

llvm-svn: 267514
2016-04-26 02:00:36 +00:00
Lang Hames e246c4390e [lli] Fix a sign-compare warning.
llvm-svn: 267512
2016-04-26 01:45:25 +00:00
Dan Gohman f456290fca [WebAssembly] Account for implicit operands when computing operand indices.
llvm-svn: 267511
2016-04-26 01:40:56 +00:00
Lang Hames 2bcc9ad88c [ORC] Try to work around a GCC 4.7 bug triggered by r267457.
llvm-svn: 267510
2016-04-26 01:27:54 +00:00
David Majnemer 30ffc4ce45 [SROA] Don't falsely report that changes have occurred
We would report that the function changed despite creating no new
allocas or performing any promotion.

This fixes PR27316.

llvm-svn: 267507
2016-04-26 01:05:00 +00:00
Andrew Kaylor 1aa3cf7d18 Reverting Thumb2SizeReduction opt bisect change to fix failing buildbots.
llvm-svn: 267506
2016-04-26 00:56:36 +00:00
Sanjay Patel a31b0c0ece [CodeGenPrepare] don't convert an unpredictable select into control flow
Suggested in the review of D19488:
http://reviews.llvm.org/D19488

llvm-svn: 267504
2016-04-26 00:47:39 +00:00
Junmo Park 3c65acf87e Remove MinLatency in SchedMachineModel. NFC.
Summary:
We don't use MinLatency any more since r184032.

Reviewers: atrick, hfinkel, mcrosier

Differential Revision: http://reviews.llvm.org/D19474

llvm-svn: 267502
2016-04-26 00:37:46 +00:00
Justin Bogner 1a07501379 PM: Port GlobalOpt to the new pass manager
llvm-svn: 267499
2016-04-26 00:28:01 +00:00
Justin Bogner d2f3d0a79d PM: Convert the logic for GlobalOpt into static functions. NFC
Pass all of the state we need around as arguments, so that these
functions are easier to reuse. There is one part of this that is
unusual: we pass around a functor to look up a DomTree for a function.
This will be a necessary abstraction when we try to use this code in
both the legacy and the new pass manager.

llvm-svn: 267498
2016-04-26 00:27:56 +00:00
Ahmed Bougacha 5cf735a5b1 [X86] Use LivePhysRegs in X86FixupBWInsts.
Kill-flags, which computeRegisterLiveness uses, are not reliable.
LivePhysRegs is.

Differential Revision: http://reviews.llvm.org/D19472

llvm-svn: 267495
2016-04-26 00:00:48 +00:00
Justin Bogner 6f6c5f2a02 GlobalOpt: Convert a bunch of tests from grep to FileCheck
llvm-svn: 267493
2016-04-25 23:36:50 +00:00
Sanjay Patel 82059090d3 Add check for "branch_weights" with prof metadata
While we're here, fix the comment and variable names to make it
clear that these are raw weights, not percentages.

llvm-svn: 267491
2016-04-25 23:15:16 +00:00
Chris Bieneman ed737d7881 [CMake] If set we should pass LLVM_VERSION_INFO into config.h
Autoconf used to support setting LLVM_VERSION_INFO, and there is some code scattered around llvm, in Support/CommandLine.cpp and LTO/LTOCodeGenerator.cpp, that uses it if it is set.

We also shouldn't be explicitly setting it as a define on llvm-shlib. It is pointless there because there is no code using it in llvm-shlib, and it is better to have it as part of the generated config.h so that it is available everywhere.

llvm-svn: 267490
2016-04-25 23:02:47 +00:00
James Y Knight 51208eaccc [Sparc] Fix double-float fabs and fneg on little endian CPUs.
The SparcV8 fneg and fabs instructions interestingly come only in a
single-float variant. Since the sign bit is always the topmost bit no
matter what size float it is, you simply operate on the high
subregister, as if it were a single float.

However, the layout of double-floats in the float registers is reversed
on little-endian CPUs, so that the high bits are in the second
subregister, rather than the first.

Thus, this expansion must check the endianness to use the correct
subregister.
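
The subregister choice can be modelled in a few lines of Python. This is an illustration of the idea only, not the backend's expansion code: a double is treated as a pair of 32-bit subregisters, and all function names are hypothetical.

```python
import struct

SIGN_BIT = 0x80000000


def to_subregs(value: float, little_endian: bool) -> tuple:
    """Split a double into two 32-bit subregisters. The high word comes
    first on big-endian and second on little-endian, per the text above."""
    hi, lo = struct.unpack(">II", struct.pack(">d", value))
    return (lo, hi) if little_endian else (hi, lo)


def from_subregs(subregs: tuple, little_endian: bool) -> float:
    """Reassemble a double from the subregister pair."""
    hi, lo = subregs[::-1] if little_endian else subregs
    return struct.unpack(">d", struct.pack(">II", hi, lo))[0]


def high_subreg_index(little_endian: bool) -> int:
    """Index of the subregister that holds the sign bit."""
    return 1 if little_endian else 0


def fabs_on_subreg(subregs: tuple, index: int) -> tuple:
    """Single-float style fabs: clear bit 31 of one 32-bit subregister."""
    regs = list(subregs)
    regs[index] &= ~SIGN_BIT & 0xFFFFFFFF
    return tuple(regs)


if __name__ == "__main__":
    for le in (False, True):
        regs = to_subregs(-1.5, le)
        fixed = fabs_on_subreg(regs, high_subreg_index(le))
        print(le, from_subregs(fixed, le))
```

Always operating on the first subregister would leave the sign untouched in the little-endian layout, which is the case the endianness check guards against.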

llvm-svn: 267489
2016-04-25 22:54:09 +00:00