This updates the AVR Select8/Select16 expansion code so that, when
inserting the two basic blocks for true and false conditions, any
existing fallthrough on the previous block is preserved.
Prior to this patch, if the block before the Select pseudo fell through
to the subsequent block, two new basic blocks would be inserted at the
prior fallthrough point, changing the fallthrough destination.
The predecessor or successor lists were not updated, causing the
BranchFolding pass at -O1 and above the rearrange basic blocks, causing
an infinite loop. Not to mention the unconditional fallthrough to the
true block is incorrect in of itself.
This patch modifies the Select8/16 expansion so that, if inserting true
and false basic blocks at a fallthrough point, the implicit branch is
preserved by means of an explicit, unconditional branch to the previous
fallthrough destination.
Thanks to Carl Peto for reporting this bug.
This fixes avr-rust bug https://github.com/avr-rust/rust/issues/123.
llvm-svn: 351721
This reverts commit r351718.
Carl pointed out that the unit test could be improved.
This patch will be recommitted once the test is made more resilient.
llvm-svn: 351719
This updates the AVR Select8/Select16 expansion code so that, when
inserting the two basic blocks for true and false conditions, any
existing fallthrough on the previous block is preserved.
Prior to this patch, if the block before the Select pseudo fell through
to the subsequent block, two new basic blocks would be inserted at the
prior fallthrough point, changing the fallthrough destination.
The predecessor or successor lists were not updated, causing the
BranchFolding pass at -O1 and above the rearrange basic blocks, causing
an infinite loop. Not to mention the unconditional fallthrough to the
true block is incorrect in of itself.
This patch modifies the Select8/16 expansion so that, if inserting true
and false basic blocks at a fallthrough point, the implicit branch is
preserved by means of an explicit, unconditional branch to the previous
fallthrough destination.
Thanks to Carl Peto for reporting this bug.
This fixes avr-rust bug https://github.com/avr-rust/rust/issues/123.
llvm-svn: 351718
There is a combine that was hiding these tests
not actually testing what they should be, although
they were producing the expected end result.
llvm-svn: 351698
This causes a couple of changes in the upgrade tests as signed/unsigned eq/ne are equivalent and we constant fold true/false codes, these changes are the same as what we already do for avx512 cmp/ucmp.
Noticed while cleaning up vector integer comparison costs for PR40376.
llvm-svn: 351697
This was crashing in the predicate function assuming the value
is a vector.
Copy more of what AArch64 uses. This probably needs more refinement
later, but I don't exactly understand what it means in some cases,
particularly since any legalization for these seems to be missing.
llvm-svn: 351693
We were upgrading these to the new style VPCOM/VPCOMU intrinsics (which includes the condition code immediate), but we'll be getting rid of those shortly, so convert these to generics first.
This causes a couple of changes in the upgrade tests as signed/unsigned eq/ne are equivalent and we constant fold true/false codes, these changes are the same as what we already do for avx512 cmp/ucmp.
Noticed while cleaning up vector integer comparison costs for PR40376.
llvm-svn: 351690
These intrinsics can always be replaced with generic integer comparisons without any regression in codegen, even for -O0/-fast-isel cases.
Noticed while cleaning up vector integer comparison costs for PR40376.
A future commit will remove/autoupgrade the existing VPCOM/VPCOMU llvm intrinsics.
llvm-svn: 351688
Prior to SSE41 (and sometimes on AVX1), vector select has to be performed as a ((X & C)|(Y & ~C)) bit select.
Exposes a couple of issues with the min/max reduction costs (which only go down to SSE42 for some reason).
The increase pre-SSE41 selection costs also prevent a couple of tests from firing any longer, so I've either tweaked the target or added AVX tests as well to the existing SSE2 tests.
llvm-svn: 351685
Prior to this patch, the AVR::LDWRdPtr instruction was always lowered to
instructions of this pattern:
ld $GPR8, [PTR:XYZ]+
ld $GPR8, [PTR]+1
This has a problem; the [PTR] is incremented in-place once, but never
decremented.
Future uses of the same pointer will use the now clobbered value,
leading to the pointer being incorrect by an offset of one.
This patch modifies the expansion code of the LDWRdPtr pseudo
instruction so that the pointer variable is not silently clobbered in
future uses in the same live range.
Bug first reported by Keshav Kini.
Patch by Kaushik Phatak.
llvm-svn: 351673
This reverts commit r351544.
In that commit, I had mistakenly misattributed the issue submitter as
the patch author, Kaushik Phatak.
The patch will be recommitted immediately with the correct attribution.
llvm-svn: 351672
The debug directory contains the rwa file address of itself,
which is updated on write. Add a testcase for this existing
functionality.
Differential Revision: https://reviews.llvm.org/D56876
llvm-svn: 351659
Followup to D55745, this time handling comparisons with ugt and ult
predicates (which are the canonical forms for non-equality predicates).
For ctlz we can convert into a simple icmp, for cttz we can convert
into a mask check.
Differential Revision: https://reviews.llvm.org/D56355
llvm-svn: 351645
This modification of the currently unused inter-procedural constant
propagation pass (IPConstantPropagation) shows how abstract call sites
enable optimization of callback calls alongside direct and indirect
calls. Through minimal changes, mostly dealing with the partial mapping
of callbacks, inter-procedural constant propagation was enabled for
callbacks, e.g., OpenMP runtime calls or pthreads_create.
Differential Revision: https://reviews.llvm.org/D56447
llvm-svn: 351628
An abstract call site is a wrapper that allows to treat direct,
indirect, and callback calls the same. If an abstract call site
represents a direct or indirect call site it behaves like a stripped
down version of a normal call site object. The abstract call site can
also represent a callback call, thus the fact that the initially
called function (=broker) may invoke a third one (=callback callee).
In this case, the abstract call side hides the middle man, hence the
broker function. The result is a representation of the callback call,
inside the broker, but in the context of the original instruction that
invoked the broker.
Again, there are up to three functions involved when we talk about
callback call sites. The caller (1), which invokes the broker
function. The broker function (2), that may or may not invoke the
callback callee. And finally the callback callee (3), which is the
target of the callback call.
The abstract call site will handle the mapping from parameters to
arguments depending on the semantic of the broker function. However,
it is important to note that the mapping is often partial. Thus, some
arguments of the call/invoke instruction are mapped to parameters of
the callee while others are not. At the same time, arguments of the
callback callee might be unknown, thus "null" if queried.
This patch introduces also !callback metadata which describe how a
callback broker maps from parameters to arguments. This metadata is
directly created by clang for known broker functions, provided through
source code attributes by the user, or later deduced by analyses.
For motivation and additional information please see the corresponding
talk (slides/video)
https://llvm.org/devmtg/2018-10/talk-abstracts.html#talk20
as well as the LCPC paper
http://compilers.cs.uni-saarland.de/people/doerfert/par_opt_lcpc18.pdf
Differential Revision: https://reviews.llvm.org/D54498
llvm-svn: 351627
Thanks to Nikita Popov for pointing out this missed case.
This is a follow-up to r351411, which disabled function merging for
vararg functions outright due to a miscompile (see llvm.org/PR40345).
Differential Revision: https://reviews.llvm.org/D56865
llvm-svn: 351624
If an inherently cold function is found, mark it as cold. For now this
means applying the `cold` and `minsize` attributes.
As a drive-by, revisit and clean up the criteria for considering a
function for splitting. Add tests.
llvm-svn: 351623
CodeExtractor permits extracting a region of blocks from a function even
when values defined within the region are used outside of it.
This is typically done by creating an alloca in the original function
and reloading the alloca after a call to the extracted function.
Wrap the reload in lifetime start/end markers to promote stack coloring.
Suggested by Sergei Kachkov!
Differential Revision: https://reviews.llvm.org/D56045
llvm-svn: 351621
This reverts commit r351618.
Compiler RT + ASAN tests are failing for PowerPC. Not sure
how would I reproduce these on macOS, so reverting (again)
until I do.
llvm-svn: 351619
r291284 added a nice mechanism to consistently pass CMake on/off toggles to
lit. This change uses it for LLVM_LIBXML2_ENABLED too (which was added around
the same time and doesn't use the new system yet).
Also alphabetically sort the list passed to llvm_canonicalize_cmake_booleans()
in llvm/test/CMakeLists.txt.
No intended behavior change.
Differential Revision: https://reviews.llvm.org/D56912
llvm-svn: 351615
This patch gives elfabi the ability to read DT_NEEDED entries from ELF binaries
to populate NeededLibs in TextAPI's ELFStub.
Differential Revision: https://reviews.llvm.org/D55852
llvm-svn: 351592
The existing tests already show a sub-optimal transform,
but this should make it clear that we can't just match
an 'and' op when creating movmsk instructions.
llvm-svn: 351590