Fixes https://bugs.llvm.org/show_bug.cgi?id=39926.
The size of the first copy was computed as
std::abs(std::abs(LdDisp2) - std::abs(LdDisp1)), which results in
skipped bytes if the signs of LdDisp2 and LdDisp1 differ. As far as
I can see, this should just be LdDisp2 - LdDisp1. The case where
LdDisp1 > LdDisp2 is already handled in the code above, in which case
LdDisp2 is set to LdDisp1 and this subtraction will evaluate to
Size1 = 0, which is the correct value to skip an overlapping copy.
Differential Revision: https://reviews.llvm.org/D55485
llvm-svn: 348750
Both intrinsics do the exact same thing so we really only need one.
Earlier in the 8.0 cycle we changed the signature of this intrinsic without renaming it. But it looks difficult to get the autoupgrade code to allow me to merge the intrinsics and change the signature at the same time. So I've renamed the intrinsic slightly for the new merged intrinsic. I'm skipping autoupgrading from the previous new to 8.0 signature. I've also renamed the subborrow for consistency.
llvm-svn: 348737
Summary:
`llvm::AttributeList` and `llvm::AttributeSet` are immutable, and so methods
defined on these classes, such as `addAttribute`, return a new immutable
object with the attribute added. In https://reviews.llvm.org/D55217 I attempted
to annotate methods such as `addAttribute` with `LLVM_NODISCARD`, since
calling these methods has no side-effects, and so ignoring the result
that is returned is almost certainly a programmer error.
However, committing the change resulted in new warnings in the AMDGPU target.
The AMDGPU simplify libcalls pass added in https://reviews.llvm.org/D36436
attempts to add the readonly and nounwind attributes to simplified
library functions, but instead calls the `addAttribute` methods and
ignores the result.
Modify the simplify libcalls pass to actually add the nounwind and
readonly attributes. Also update the simplify libcalls test to assert
that these attributes are actually being set.
Reviewers: rampitec, vpykhtin, rnk
Reviewed By: rampitec
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D55435
llvm-svn: 348732
Previously we had to take the carry in and add -1 to it to set the carry flag so we could use it with ADC/SBB. But if we know its 0 then we don't need to bother.
This should go a long way towards fixing PR24545.
llvm-svn: 348727
The dependency was added in r213995 in response to r213986 which did make
X86/Utils depend on IR, but r256680 later removed that dependency again.
llvm-svn: 348724
The existing code tries to handle an undef operand while transforming an add to an LEA,
but it's incomplete because we will crash on the i16 test with the debug output shown below.
It's better to just give up instead. Really, GlobalIsel should have folded these before we
could get into trouble.
# Machine code for function add_undef_i16: NoPHIs, TracksLiveness, Legalized, RegBankSelected, Selected
bb.0 (%ir-block.0):
liveins: $edi
%1:gr32 = COPY killed $edi
%0:gr16 = COPY %1.sub_16bit:gr32
%5:gr64_nosp = IMPLICIT_DEF
%5.sub_16bit:gr64_nosp = COPY %0:gr16
%6:gr64_nosp = IMPLICIT_DEF
%6.sub_16bit:gr64_nosp = COPY %2:gr16
%4:gr32 = LEA64_32r killed %5:gr64_nosp, 1, killed %6:gr64_nosp, 0, $noreg
%3:gr16 = COPY killed %4.sub_16bit:gr32
$ax = COPY killed %3:gr16
RET 0, implicit killed $ax
# End machine code for function add_undef_i16.
*** Bad machine code: Reading virtual register without a def ***
- function: add_undef_i16
- basic block: %bb.0 (0x7fe6cd83d940)
- instruction: %6.sub_16bit:gr64_nosp = COPY %2:gr16
- operand 1: %2:gr16
LLVM ERROR: Found 1 machine code errors.
Differential Revision: https://reviews.llvm.org/D54710
llvm-svn: 348722
Extension to rL348617, turns out llvm-exegesis doesn't need to match the perf counter name against a scheduler model resource name - so I've added a few more counters that I could find in the libpfm4 source code (and fix a typo in the knl/knm retired_uops counter - which uses 'all' instead of 'any').
llvm-svn: 348721
We were still using the rounded down offset and alignment even though
they aren't handled because you can't trivially bitcast the loaded
value.
llvm-svn: 348658
Summary:
- LLVM clang-format style doesn't allow one-line ifs.
- LLVM clang-tidy style says method names should start with a lowercase
letter. But currently WebAssemblyAsmParser's parent class
MCTargetAsmParser is mixing lowercase and uppercase method names
itself so overridden methods cannot be renamed now.
- Changed else ifs after returns to ifs.
- Added some newlines for readability.
Reviewers: aardappel, sbc100
Subscribers: dschuff, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D55350
llvm-svn: 348648
To make X86CondBrFoldingPass can be run with --run-pass option, this can test one wrong assertion on analyzeCompare function for SUB32ri when its operand is not imm
Patch by Jianping Chen
Differential Revision: https://reviews.llvm.org/D55412
llvm-svn: 348620
This patch attempts to improve pfm perf counter coverage for all the x86 CPUs that libpfm4 supports.
Intel/AMD CPU families tend to share names for cycle/uops counters so even if they don't have a scheduler model yet they can at least use the default values (checked against the libpfm4 source code).
The remaining CPUs (where their port/pipe resource counters are known) I've tried to add to the existing model mappings.
These are untested but don't represent a regression to current llvm-exegesis behaviour for these CPUs.
Differential Revision: https://reviews.llvm.org/D55432
llvm-svn: 348617
This change attempts to shrink scalar AND, OR and XOR instructions which take an immediate that isn't inlineable.
It performs:
AND s0, s0, ~(1 << n) -> BITSET0 s0, n
OR s0, s0, (1 << n) -> BITSET1 s0, n
AND s0, s1, x -> ANDN2 s0, s1, ~x
OR s0, s1, x -> ORN2 s0, s1, ~x
XOR s0, s1, x -> XNOR s0, s1, ~x
In particular, this catches setting and clearing the sign bit for fabs (and x, 0x7ffffffff -> bitset0 x, 31 and or x, 0x80000000 -> bitset1 x, 31).
llvm-svn: 348601
When we had dynamic call frames (i.e. sp adjustment around each call) we
were including that adjustment into offsets calculated based on r6, even
though it's only sp that changes. This led to incorrect stack slot
accesses.
llvm-svn: 348591
Adds fatal errors for any target that does not support the Tiny or Kernel
codemodels by rejigging the getEffectiveCodeModel calls.
Differential Revision: https://reviews.llvm.org/D50141
llvm-svn: 348585
Fix assert about using an undefined physical register in machine instruction verify pass.
The reason is that register flag undef is missing when doing transformation from If Conversion Pass.
```
Bad machine code: Using an undefined physical register
- function: func_65
- basic block: %bb.0 entry (0x10024740738)
- instruction: BCLR killed $cr5lt, implicit $lr8, implicit $rm, implicit undef $x3
- operand 0: killed $cr5lt
LLVM ERROR: Found 1 machine code errors.
```
There are also other existing testcases with same issue. So I add -verify-machineinstrs option to open verifying.
Differential Revision: https://reviews.llvm.org/D55408
llvm-svn: 348566
This addresses a FIXME and avoids depending on an isel pattern match I think. I've remove the isel patterns too since he have no lit tests left that cover them. Hopefully that really means they are unused.
I'm trying to decide if we need SETCC_CARRY. This removes one of its usages.
Differential Revision: https://reviews.llvm.org/D55355
llvm-svn: 348536
Initial step towards making the function more generic (and probably move into SelectionDAG).
This is necessary to avoid massive codegen bloat for PR38243 (Add modulo rotate support to LowerRotate).
llvm-svn: 348498
Summary:
If the output of debug directives only is requested, we should drop
emission of ',debug' option from the target directive. Required for
supporting of nvprof profiler.
Reviewers: echristo
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46061
llvm-svn: 348497
Summary:
We may end up with not emitted debug directives at the end of the module
emission. Patch fixes this problem emitting those last directives the
end of the module emission.
Reviewers: echristo
Subscribers: jholewinski, llvm-commits
Differential Revision: https://reviews.llvm.org/D54320
llvm-svn: 348495
This patch splits backend features currently
hidden behind architecture versions.
For example, currently the only way to activate
complex numbers extension is targeting an v8.3
architecture, where after the patch this extension
can be added separately.
This refactoring is required by the new command lines proposal:
http://lists.llvm.org/pipermail/llvm-dev/2018-September/126346.html
Reviewers: DavidSpickett, olista01, t.p.northover
Subscribers: kristof.beyls, bryanpkc, javed.absar, pbarrio
Differential revision: https://reviews.llvm.org/D54633
--
It was reverted in rL348249 due a build bot failure in one of the
regression tests:
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/14386
The problem seems to be that FileCheck behaves
different in windows and linux. This new patch
splits the test file in multiple,
and does more exact pattern matching attempting
to circumvent the issue.
llvm-svn: 348493
...yet!
A lot of the current code should be shared for arm and thumb mode, but
until we add tests and work out some of the details (e.g. checking the
correct subtarget feature for G_SDIV) it's safer to bail out as early as
possible for thumb targets.
This should have arguably been part of r348347, which allowed Thumb
functions to be handled by the IR Translator.
llvm-svn: 348472
The code emitting AND-subtrees used to check whether any of the operands
was an OR in order to figure out if the result needs to be negated.
However the OR could be hidden in further subtrees and not immediately
visible.
Change the code so that canEmitConjunction() determines whether the
result of the generated subtree needs to be negated. Cleanup emission
logic to use this. I also changed the code a bit to make all negation
decisions early before we actually emit the subtrees.
This fixes http://llvm.org/PR39550
Differential Revision: https://reviews.llvm.org/D54137
llvm-svn: 348444
https://reviews.llvm.org/D54980
This provides a standard API across GISel passes to observe and notify
passes about changes (insertions/deletions/mutations) to MachineInstrs.
This patch also removes the recordInsertion method in MachineIRBuilder
and instead provides method to setObserver.
Reviewed by: vkeles.
llvm-svn: 348406
Whenever we effectively take the address of a basic block we need to
manually update that basic block to reflect that fact or later passes
such as tail duplication and tail merging can break the invariants of
the code. =/ Sadly, there doesn't appear to be any good way of
automating this or even writing a reasonable assert to catch it early.
The change seems trivially and obviously correct, but sadly the only
really good test case I have is 1000s of basic blocks. I've tried
directly writing a test case that happens to make tail duplication do
something that crashes later on, but this appears to require an
*amazingly* complex set of conditions that I've not yet reproduced.
The change is technically covered by the tests because we mark the
blocks as having their address taken, but that doesn't really count as
properly testing the functionality.
llvm-svn: 348374
Prep work for PR38243 - mainly adding comments on where we need to add modulo support (doing so at the moment causes massive codegen regressions).
I've also consistently added support for modulo folding for uniform constants (although at the moment we have no way to trigger this) and removed the old assertions.
llvm-svn: 348366
This is an initial patch to add a minimum level of support for funnel shifts to the SelectionDAG and to begin wiring it up to the X86 SHLD/SHRD instructions.
Some partial legalization code has been added to handle the case for 'SlowSHLD' where we want to expand instead and I've added a few DAG combines so we don't get regressions from the existing DAG builder expansion code.
Differential Revision: https://reviews.llvm.org/D54698
llvm-svn: 348353
Functions annotated with `__fastcall` or `__attribute__((__fastcall__))`
or `__attribute__((__swiftcall__))` may contain SEH handlers even on
Win64. This matches the behaviour of cl which allows for
`__try`/`__except` inside a `__fastcall` function. This was detected
while trying to self-host clang on Windows ARM64.
llvm-svn: 348337
We previously disabled this in r323371 because of a bug where we selected an
extending load, but didn't delete the old G_LOAD, resulting in two loads being
generated for volatile loads.
Since we now have dedicated G_SEXTLOAD/G_ZEXTLOAD operations, and that the
tablegen patterns should no longer be able to select (ext(load x)) patterns, it
should be safe to re-enable it.
The old test case should still work as expected.
llvm-svn: 348320