Previously we were recursing on our operands for unary and binary operators regardless of whether we knew how to reason about the operator in question. This has the effect of doing a potentially large amount of work, only to throw it away. By checking whether the operation is one LVI can handle, we can cut short the search and return the (overdefined) answer more quickly. The quality of the results produced should not change.
llvm-svn: 267626
As pointed out by John Regehr over in http://reviews.llvm.org/D19485, LVI was being incredibly stupid about applying its transfer rules. Rather than gathering local facts from the expression itself, it was simply giving up entirely if one of the inputs was overdefined. This greatly impacts the precision of the overall analysis and makes it far more fragile as well.
This patch builds on 267609 which did the same thing for unary casts.
llvm-svn: 267620
NVPTXLowerKernelArgs is required for correctness, so it should not be guarded
by CodeGenOpt::None.
NVPTXPeephole is optimization only, so it should be skipped when
CodeGenOpt::None.
llvm-svn: 267619
Essentially, I was using the wrong size function. For types which were sized, but not primitive, I wasn't getting a useful size for the operand and failed an assert. I fixed this, and also added a guard that the input is a sized type. Test case is for the original mistake. I'm not sure how to actually exercise the sized type check.
llvm-svn: 267618
We need the default ratio to be sufficiently large that it triggers transforms
based on block frequency info (BFI) and plays well with the recently introduced
BranchProbability used by CGP.
Differential Revision: http://reviews.llvm.org/D19435
llvm-svn: 267615
As pointed out by John Regehr over in http://reviews.llvm.org/D19485, LVI was being incredibly stupid about applying its transfer rules. Rather than gathering local facts from the expression itself, it was simply giving up entirely if one of the inputs was overdefined. This greatly impacts the precision of the overall analysis and makes it far more fragile as well.
This patch implements only the unary operation case. Once this is in, I'll implement the same for the binary operations.
Differential Revision: http://reviews.llvm.org/D19492
llvm-svn: 267609
The destination buffer that sprintf uses is restrict qualified, we do
not need to worry about derived pointers referenced via format
specifiers.
This reverts commit r267580.
llvm-svn: 267605
The DBI stream contains a lot of bookkeeping information for other
streams. In particular it contains information about section contributions
and linked modules. This patch is a first attempt at parsing some of the
information out of the DBI stream. It currently only parses and dumps the
headers of the DBI stream, so none of the module data or section
contribution data is pulled out.
This is just a proof of concept that we understand the basic properties of
the DBI stream's metadata, and followup patches will try to extract more
detailed information out.
Differential Revision: http://reviews.llvm.org/D19500
Reviewed By: majnemer, ruiu
llvm-svn: 267585
When a block is tail-duplicated, the PHI nodes from that block are
replaced with appropriate COPY instructions. When those PHI nodes
contained use operands with subregisters, the subregisters were
dropped from the COPY instructions, resulting in incorrect code.
Keep track of the subregister information and use this information
when remapping instructions from the duplicated block.
Differential Revision: http://reviews.llvm.org/D19337
llvm-svn: 267583
sprintf doesn't read or copy the terminating null byte from it's string
operands. sprintf will append it's own after processing all of the
format specifiers.
This fixes PR27526.
llvm-svn: 267580
The Machine Instruction Verifier flagged some issues in the serialized MIR.
Adjust the input to correct them.
Fixes the remaining portion of PR27480.
llvm-svn: 267578
This is part of solving PR27344:
https://llvm.org/bugs/show_bug.cgi?id=27344
CGP should undo the SimplifyCFG transform for the same reason that earlier patches have used this
same mechanism: it's possible that passes between SimplifyCFG and CGP may be able to optimize the
IR further with a select in place.
For the TLI hook default, >99% taken or not taken is chosen as the default threshold for a highly
predictable branch. Even the most limited HW branch predictors will be correct on this branch almost
all the time, so even a massive mispredict penalty perf loss would be overcome by the win from all
the times the branch was predicted correctly.
As a follow-up, we could make the default target hook less conservative by using the SchedMachineModel's
MispredictPenalty. Or we could just let targets override the default by implementing the hook with that
and other target-specific options. Note that trying to statically determine mispredict rates for
close-to-balanced profile weight data is generally impossible if the HW is sufficiently advanced. Ie,
50/50 taken/not-taken might still be 100% predictable.
Finally, note that this patch as-is will not solve PR27344 because the current __builtin_unpredictable()
branch weight default values are 4 and 64. A proposal to change that is in D19435.
Differential Revision: http://reviews.llvm.org/D19488
llvm-svn: 267572
In gcc, \ escapes every character in response files. It is true that this makes
it harder to mention Windows files in rsp files, but not doing this means clang
disagrees with gcc, and also disagrees with the shell (on non-Windows) which
rsp file quoting is supposed to match. clang isn't free to choose what to do
here.
In general, the idea for response files is to take bits of your command line
and write them to a file unchanged, and have things work the same way. Since
the command line would've been interpreted by the shell, things in the rsp file
need to be subject to the same shell quoting rules.
People who want to put Windows-style paths in their response files either need
to do any of:
* escape their backslashes
* or use clang-cl which uses cl.exe/cmd.exe quoting rules
* pass --rsp-quoting=windows to clang to tell it to use
cl.exe/cmd.exe quoting rules for response files.
Fixes PR27464.
http://reviews.llvm.org/D19417
llvm-svn: 267556
Support for SDWA instructions for VOP1 and VOP2 encoding.
Not done yet:
- converters for support optional operands and modifiers
- VOPC
- sext() modifier
- intrinsics
- VOP2b (see vop_dpp.s)
- V_MAC_F32 (see vop_dpp.s)
Differential Revision: http://reviews.llvm.org/D19360
llvm-svn: 267553
Handle MachineBasicBlock as a memory displacement operand in the LEA optimization pass.
Differential Revision: http://reviews.llvm.org/D19409
llvm-svn: 267551
If the linker specifically requested for a linkonce to be preserved,
we need to make sure we won't drop it even if all the uses in the
current module disappear.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267543
print-stack-trace.cc test failure of compiler-rt has been fixed by
r266869 (http://reviews.llvm.org/D19148), so reenable sibling call
optimization on ppc64
Reviewers: nemanjai kbarton
llvm-svn: 267527