Build vectors have magical truncation powers, so we have things like this:
v4i1 = BUILD_VECTOR Constant:i32<1>, Constant:i32<1>, Constant:i32<1>, Constant:i32<1>
v4i16 = BUILD_VECTOR Constant:i32<1>, Constant:i32<1>, Constant:i32<1>, Constant:i32<1>
If we don't truncate the splat node returned by getConstantSplatNode(), then we won't find
truth when ZeroOrNegativeOneBooleanContent is the rule.
Differential Revision: https://reviews.llvm.org/D32505
llvm-svn: 301408
This patch reapplies r301309 with the fix to the MIR test to fix the assertion
triggered by r301309. Had trimmed a little bit too much from the MIR!
llvm-svn: 301317
This patch fixes a bug with the updating of DBG_VALUE's in
BreakAntiDependencies. Previously, it would only attempt to update the first
DBG_VALUE following the instruction whose register is being changed,
potentially leaving DBG_VALUE's referring to the wrong register. Now the code
will update all DBG_VALUE's that immediately follow the instruction.
This issue was detected as a result of an optimized codegen difference with
"-g" where an X86 byte/word fixup was not performed due to a DBG_VALUE
referencing the wrong register.
Differential Revision: https://reviews.llvm.org/D31755
llvm-svn: 301309
I'm proposing a fold for increment-of-sexted-bool in:
https://reviews.llvm.org/D31944
...so we need to know what happens in more cases like these.
llvm-svn: 301269
When functions are terminated by unreachable instructions, the last
instruction might trigger a CFI instruction to be generated. However,
emitting it would be be illegal since the function (and thus the FDE
the CFI is in) has already ended with the previous instruction.
Darwin's dwarfdump --verify --eh-frame complains about this and the
specification supports this.
Relevant bits from the DWARF 5 standard (6.4 Call Frame Information):
"[The] address_range [field in an FDE]: The number of bytes of
program instructions described by this entry."
"Row creation instructions: [...]
The new location value is always greater than the current one."
The first quotation implies that a CFI cannot describe a target
address outside of the enclosing FDE's range.
rdar://problem/26244988
Differential Revision: https://reviews.llvm.org/D32246
llvm-svn: 301219
While we use BaseIndexOffset in FindBetterNeighborChains to
appropriately realize they're almost the same address and should be
improved concurrently we do not use it in isAlias using the non-index
understanding FindBaseOffset instead. Adding a BaseIndexOffset check
in isAlias like should allow indexed stores to be merged.
FindBaseOffset to be excised in subsequent patch.
Reviewers: jyknight, aditya_nandakumar, bogner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31987
llvm-svn: 301187
Since Split DWARF needs to name the actual .dwo file that is generated,
it can't be known at the time the llvm::Module is produced as it may be
merged with other Modules before the object is generated and that object
may be generated with any name.
By passing the Split DWARF file name when LLVM is producing object code
the .dwo file name in the object file can match correctly.
The support for Split DWARF for implicit modules remains the same -
using metadata to store the dwo name and dwo id so that potentially
multiple skeleton CUs referring to different dwo files can be generated
from one llvm::Module.
llvm-svn: 301062
In addition to the original commit, tighten the condition for when to
pad empty functions to COFF Windows. This avoids running into problems
when targeting e.g. Win32 AMDGPU, which caused test failures when this
was committed initially.
llvm-svn: 301047
Empty functions can lead to duplicate entries in the Guard CF Function
Table of a binary due to multiple functions sharing the same RVA,
causing the kernel to refuse to load that binary.
We had a terrific bug due to this in Chromium.
It turns out we were already doing this for Mach-O in certain
situations. This patch expands the code for that in
AsmPrinter::EmitFunctionBody() and renames
TargetInstrInfo::getNoopForMachoTarget() to simply getNoop() since it
seems it was used for not just Mach-O anyway.
Differential Revision: https://reviews.llvm.org/D32330
llvm-svn: 301040
places based on it.
Existing constant hoisting pass will merge a group of contants in a small range
and hoist the const materialization code to the common dominator of their uses.
However, if the uses are all in cold pathes, existing implementation may hoist
the materialization code from cold pathes to a hot place. This may hurt performance.
The patch introduces BFI to the pass and selects the best insertion places based
on it.
The change is controlled by an option consthoist-with-block-frequency which is
off by default for now.
Differential Revision: https://reviews.llvm.org/D28962
llvm-svn: 300989
when the subtarget has fast strings.
This has two advantages:
- Speed is improved. For example, on Haswell thoughput improvements increase
linearly with size from 256 to 512 bytes, after which they plateau:
(e.g. 1% for 260 bytes, 25% for 400 bytes, 40% for 508 bytes).
- Code is much smaller (no need to handle boundaries).
llvm-svn: 300957
Debug information is calculated with getFrameIndexReference() which was
missing some logic for the fixed object cases (= parameters on the stack).
rdar://24557797
Differential Revision: https://reviews.llvm.org/D32204
llvm-svn: 300781
I've changed one of the tests to not fold away, but we didn't and still don't do the transform
that the comment claims we do (and I don't know why we'd want to do that).
Follow-up to:
https://reviews.llvm.org/rL300725https://reviews.llvm.org/rL300763
llvm-svn: 300772
This allows forming more 'not' ops, so we get improvements for ISAs that have and-not.
Follow-up to:
https://reviews.llvm.org/rL300725
llvm-svn: 300763
The patch itself is simple: stop discriminating against vectors in visitAnd() and again in
SimplifyDemandedBits().
Some notes for reference:
1. We're not consistent about calls to SimplifyDemandedBits in the various visitXXX functions.
Sometimes, we check if the RHS is a constant first. Other times (like here), we just dive in.
2. I'd like to break the vector shackles in steps for the sake of risk minimization, but we could
make similar simultaneous changes in other places if we think that would be better.
3. I don't know what the intent of the changed tests in this patch was supposed to be, but since
they wiggled in a positive way, I'm just going with that. :)
4. In the rotate tests, note that we can see through non-splat constants. This is a result of D24253.
5. My motivation for being here now is to make D31944 look better, so this is step 1 of N towards
improving the vector codegen in that patch without writing any actual new code.
Differential Revision: https://reviews.llvm.org/D32230
llvm-svn: 300725