Similar to horizontal ops on D56777, the sse2 (but not mmx) bit shift ops has local forwarding disabled, adding +1cy to the use latency for the result.
Differential Revision: https://reviews.llvm.org/D57026
llvm-svn: 351817
Similar to horizontal ops on D56777, the vpermilpd/vpermilps variable mask ops has local forwarding disabled, adding +1cy to the use latency for the result.
Differential Revision: https://reviews.llvm.org/D57022
llvm-svn: 351815
First step towards PR40376, this patch adds support for getCmpSelInstrCost to use the (optional) Instruction CmpInst predicate to indicate the type of integer comparison we're performing and alter the costs accordingly.
Differential Revision: https://reviews.llvm.org/D57013
llvm-svn: 351810
When we are inserting 1 "inline" element, and zeroing 2 of the other elements then we can safely commute the insertps source inputs to improve memory folding.
Differential Revision: https://reviews.llvm.org/D56843
llvm-svn: 351807
Avoid the infinite loop caused by the target DAG combine converting ANYEXT to
SIGNEXT and the target-independent DAG combine logic converting back to
ANYEXT. Do this by not adding the new node to the worklist.
Committing directly as this definitely doesn't make the problem any worse, and
I intend to follow-up with a patch that avoids this custom combiner logic
altogether and just lowers the i32 operations to a target-specific
SelectionDAG node. This should be easier to reason about and improve codegen
quality in some cases (though may miss out on some later DAG combines).
llvm-svn: 351806
This patch adds support of guards expressed as branches by widenable
conditions in Loop Predication.
Differential Revision: https://reviews.llvm.org/D56081
Reviewed By: reames
llvm-svn: 351805
This was requested in the review of D57006.
Also add missing quotes around symbol names in error messages.
Differential Revision: https://reviews.llvm.org/D57014
llvm-svn: 351799
This broke the RISCV build, and even with that fixed, one of the RISCV
tests behaves surprisingly differently with asserts than without,
leaving there no clear test pattern to use. Generally it seems bad for
hte IR to differ substantially due to asserts (as in, an alloca is used
with asserts that isn't needed without!) and nothing I did simply would
fix it so I'm reverting back to green.
This also required reverting the RISCV build fix in r351782.
llvm-svn: 351796
This patch adds a function to detect guards expressed in explicit control
flow form as branch by `and` with widenable condition intrinsic call:
%wc = call i1 @llvm.experimental.widenable.condition()
%guard_cond = and i1, %some_cond, %wc
br i1 %guard_cond, label %guarded, label %deopt
deopt:
<maybe some non-side-effecting instructions>
deoptimize()
This form can be used as alternative to implicit control flow guard
representation expressed by `experimental_guard` intrinsic.
Differential Revision: https://reviews.llvm.org/D56074
Reviewed By: reames
llvm-svn: 351791
In r287786, the behaviour of --dyn-symbols in llvm-readelf (but not
llvm-readobj) was changed to print the dynamic symbols as derived from
the hash table, rather than to print the dynamic symbol table contents
directly. The original change was initially submitted without review,
and some comments were made on the commit mailing list implying that the
new behavious is GNU compatible. I argue that it is not:
1) It does not include a null symbol.
2) It prints the symbols based on an order derived from the hash
table.
3) It prints an extra column indicating which bucket it came from.
This could break parsers that expect a fixed number of columns,
with the first column being the symbol index.
4) If the input happens to have both .hash and .gnu.hash section, it
prints interpretations of them both, resulting in most symbols
being printed twice.
5) There is no way of just printing the raw dynamic symbol table,
because --symbols also prints the static symbol table.
This patch reverts the --dyn-symbols behaviour back to its old behaviour
of just printing the contents of the dynamic symbol table, similar to
what is printed by --symbols. As the hashed interpretation is still
desirable to validate the hash table, it puts it under a new switch
"--hash-symbols". This is a no-op on all output forms except for GNU
output style for ELF. If there is no hash table, it does nothing,
unlike the previous behaviour which printed the raw dynamic symbol
table, since the raw dynsym is available under --dyn-symbols.
The yaml input for the test is based on that in
test/tools/llvm-readobj/demangle.test, but stripped down to the bare
minimum to provide a valid dynamic symbol.
Note: some LLD tests needed updating. I will commit a separate patch for
those.
Reviewed by: grimar, rupprecht
Differential Revision: https://reviews.llvm.org/D56910
llvm-svn: 351789
The break isn't strictly needed yet as there is no subsequent entry in the
case. But adding to prevent mistakes further down the road.
llvm-svn: 351785
This patch may seem familiar... but my previous patch handled the
equivalent lsls+and, not this case. Usually instcombine puts the
"and" after the shift, so this case doesn't come up. However, if the
shift comes out of a GEP, it won't get canonicalized by instcombine,
and DAGCombine doesn't have an equivalent transform.
This also modifies isDesirableToCommuteWithShift to suppress DAGCombine
transforms which would make the overall code worse.
I'm not really happy adding a bunch of code to handle this, but it would
probably be tricky to substantially improve the behavior of DAGCombine
here.
Differential Revision: https://reviews.llvm.org/D56032
llvm-svn: 351776
Deopt operands are generally intended to record information about a site in code with minimal perturbation of the surrounding code. Idiomatically, they also tend to appear down rare paths. Putting these together, we have an obvious case for extending CVP w/deopt operand constant folding. Arguably, we should be doing this for all operands on all instructions, but that's definitely a much larger and risky change.
Differential Revision: https://reviews.llvm.org/D55678
llvm-svn: 351774
Specifically, clarify the following:
1. Volatile load and store may access addresses that are not memory.
2. Volatile load and store do not modify arbitrary memory.
3. Volatile load and store do not trap.
Prompted by recent volatile discussion on llvmdev.
Currently, there's sort of a split in the source code about whether
volatile operations are allowed to trap; this resolves that dispute in
favor of not allowing them to trap.
Differential Revision: https://reviews.llvm.org/D53184
llvm-svn: 351772
Not sure this is the best fix, but it saves an instruction for certain
constructs involving variable shifts.
Differential Revision: https://reviews.llvm.org/D55572
llvm-svn: 351768
Summary:
Capture the current agreed-upon toolchain update policy based on the following
discussions:
- LLVM dev meeting 2018 BoF "Migrating to C++14, and beyond!"
llvm.org/devmtg/2018-10/talk-abstracts.html#bof3
- A Short Policy Proposal Regarding Host Compilers
lists.llvm.org/pipermail/llvm-dev/2018-May/123238.html
- Using C++14 code in LLVM (2018)
lists.llvm.org/pipermail/llvm-dev/2018-May/123182.html
- Using C++14 code in LLVM (2017)
lists.llvm.org/pipermail/llvm-dev/2017-October/118673.html
- Using C++14 code in LLVM (2016)
lists.llvm.org/pipermail/llvm-dev/2016-October/105483.html
- Document and Enforce new Host Compiler Policy
llvm.org/D47073
- Require GCC 5.1 and LLVM 3.5 at a minimum
llvm.org/D46723
Subscribers: jkorous, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D56819
llvm-svn: 351765
Summary:
Use X86ISD::VFPROUND in the instruction isel patterns. Add new patterns for ISD::FP_ROUND to maintain support for fptrunc in IR.
In the process I found a couple duplicate isel patterns which I also deleted in this patch.
Reviewers: RKSimon, spatel
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D56991
llvm-svn: 351762
Summary:
For compress, a select node doesn't semantically reflect the behavior of the instruction. The mask would have holes in it, but the resulting write is to contiguous elements at the bottom of the vector.
Furthermore, as far as the compressing and expanding is concerned the behavior is depended on the mask. You can't just have an expand/compress node that only reads the input vector. That node would have no meaning by itself.
This all only works because we pattern match the compress/expand+select back to the instruction. But conceivably an optimization of the select could break the pattern and leave something meaningless.
This patch modifies the expand and compress node to take the mask and passthru as additional inputs and gets rid of the select all together.
Reviewers: RKSimon, spatel
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D57002
llvm-svn: 351761
Fixes two problems with GCNHazardRecognizer:
1. It only scans up to 5 instructions emitted earlier.
2. It does not take control flow into account. An earlier instruction
from the previous basic block is not necessarily a predecessor.
At the same time a real predecessor block is not scanned.
The patch provides a way to distinguish between scheduler and
hazard recognizer mode. It is OK to work with emitted instructions
in the scheduler because we do not really know what will be emitted
later and its order. However, when pass works as a hazard recognizer
the schedule is already finalized, and we have full access to the
instructions for the whole function, so we can properly traverse
predecessors and their instructions.
Differential Revision: https://reviews.llvm.org/D56923
llvm-svn: 351759
This is a remnant from before the gn build had a working config.h.
Defining LLVM_LIBXML2_ENABLED only for targets that depend on build/libs/xml is
nice in that only some of the codebase needs to be rebuilt when
llvm_enable_libxml2 changes -- but config.h already defines it and defining it
there and then redundantly a second time for some targets is worse than having
it just in config.h.
No behavior change.
Differential Revision: https://reviews.llvm.org/D56908
llvm-svn: 351758
This version of gcc seems to be having issues with raw literals inside macro
arguments. I change the string to use regular string literals instead.
llvm-svn: 351756
D56777 added +1cy local forwarding penalty for horizontal operations, but this penalty only affects sse2/xmm variants, the mmx variants don't suffer the penalty.
Confirmed with @andreadb
llvm-svn: 351755
These are copied from the sibling x86 file. I'm not sure which
of the current outputs (if any) is considered optimal, but
someone more familiar with AArch may want to take a look.
llvm-svn: 351754
The regression test is reduced from the example shown in D56281.
This does raise a question as noted in the test file: do we want
to handle this pattern? I don't have a motivating example for
that on x86 yet, but it seems like we could have that pattern
there too, so we could avoid the back-and-forth using a shuffle.
llvm-svn: 351753
r327630 introduced new write definitions for float/vector loads.
Before that revision, WriteLoad was used by both integer/float (scalar/vector)
load. So, WriteLoad had to conservatively declare a latency to 5cy. That is
because the load-to-use latency for float/vector load is 5cy.
Now that we have dedicated writes for float/vector loads, there is no reason why
we should keep the latency of WriteLoad to 5cy. At the moment, WriteLoad is only
used by scalar integer loads only; we can assume an optimstic 3cy latency for
them.
This patch changes that latency from 5cy to 3cy, and regenerates the affected
scheduling/mca tests.
Differential Revision: https://reviews.llvm.org/D56922
llvm-svn: 351742
all missed!
Thanks to Alex Bradbury for pointing this out, and the fact that I never
added the intended `legacy` anchor to the developer policy. Add that
anchor too. With hope, this will cause the links to all resolve
successfully.
llvm-svn: 351731