Summary: As per title. This will add the instructiions we are interested in in the worklist.
Reviewers: mehdi_amini, majnemer, andreadb
Differential Revision: https://reviews.llvm.org/D29081
llvm-svn: 292957
Regalloc creates COPY instructions which do not formally use VALU.
That results in v_mov instructions displaced after exec mask modification.
One pass which do it is SIOptimizeExecMasking, but potentially it can be
done by other passes too.
This patch adds a pass immediately after regalloc to add implicit exec
use operand to all VGPR copy instructions.
Differential Revision: https://reviews.llvm.org/D28874
llvm-svn: 292956
In order to follow the pattern of the existing 'slow-misaligned-128store'
option, rename the option 'no-quad-ldst-pairs' to 'slow-paired-128'.
llvm-svn: 292954
Summary:
This is in keeping with LLVM convention. The classes are InstPrinters, but the library is ${target}AsmPrinter.
This patch is in response to bryant pointing out to me that Lanai was the only backend deviating from convention here. Thanks!
Reviewers: jpienaar, bryant
Subscribers: mgorny, jgosnell, llvm-commits
Differential Revision: https://reviews.llvm.org/D29043
llvm-svn: 292953
Also fixes a much worse bug where we emitted the wrong gap size for the
def range uncovered by the test for this issue.
Fixes PR31726.
llvm-svn: 292949
Summary:
When conditional branches with complex conditions are split into
multiple branches in SelectionDAGBuilder::FindMergedConditions, also
handle inverted conditions. These may sometimes appear without having
been optimized by InstCombine when CodeGenPrepare decides to sink and
duplicate cmp instructions, causing them to have only one use. This
problem can be increased by e.g. GVNHoist hiding more cmps from
InstCombine by combining equivalent cmps from different blocks.
For example codegen X & !(Y | Z) as:
jmp_if_X TmpBB
jmp FBB
TmpBB:
jmp_if_notY Tmp2BB
jmp FBB
Tmp2BB:
jmp_if_notZ TBB
jmp FBB
Reviewers: bogner, MatzeB, qcolombet
Subscribers: llvm-commits, hiraditya, mcrosier, sebpop
Differential Revision: https://reviews.llvm.org/D28380
llvm-svn: 292944
a lazy-asserting PoisoningVH.
AssertVH is fundamentally incompatible with cache-invalidation of
analysis results. The invaliadtion happens after the AssertingVH has
already fired. Instead, use a PoisoningVH that will assert if the
dangling handle is ever used rather than merely be assigned or
destroyed.
This patch also removes all of the (numerous) doomed attempts to work
around this fundamental incompatibility. It is a pretty significant
simplification IMO.
The most interesting change is in the Inliner where we still do some
clearing because we don't want to rely on the coarse grained
invalidation strategy of the containing pass manager. However, I prefer
the approach that contains this logic to the cleanup phase of the
Inliner, and I think we could enhance the CGSCC analysis management
layer to make this even better in the future if desired.
The rest is straight cleanup.
I've also added a test for one of the harder cases to work around: when
a *module analysis* contains many AssertingVHes pointing at functions.
Differential Revision: https://reviews.llvm.org/D29006
llvm-svn: 292928
Added early out for single undef input - we were already supporting (and testing) this in the constant folding code, we just do it quicker now
Drop undef handling from demanded elts code now that we handle it fully in InstCombiner::visitCallInst
llvm-svn: 292913
Summary:
Use the O_CLOEXEC flag only when it is available. Some old systems (e.g.
SLES10) do not support this flag. POSIX explicitly guarantees that this
flag can be checked for using #if, so there is no need for a CMake
check.
In case O_CLOEXEC is not supported, fall back to fcntl(FD_CLOEXEC)
instead.
Reviewers: rnk, rafael, mgorny
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28894
llvm-svn: 292912
Summary:
This adds a cross-platform way of setting the current working directory
analogous to the existing current_path() function used for retrieving
it. The function will be used in lldb.
Reviewers: rafael, silvas, zturner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29035
llvm-svn: 292907
Removed data members ReduxWidth and MinVecRegSize + some C++11 stylish
improvements.
Differential Revision: https://reviews.llvm.org/D29010
llvm-svn: 292899
With this change dominator tree remains in sync after each step of loop
peeling.
Differential Revision: https://reviews.llvm.org/D29029
llvm-svn: 292895
Verifications of dominator tree and loop info are expensive operations
so they are disabled by default. They can be enabled by command line
options -verify-dom-info and -verify-loop-info. These options however
enable checks only in files Dominators.cpp and LoopInfo.cpp. If some
transformation changes dominaror tree and/or loop info, it would be
convenient to place similar checks to the files implementing the
transformation.
This change makes corresponding flags global, so they can be used in
any file to optionally turn verification on.
llvm-svn: 292889
The GeneralShuffle::add() method used to have an assert that made sure that
source elements were at least as big as the destination elements. This was
wrong, since it is actually expected that an EXTRACT_VECTOR_ELT node with a
smaller source element type than the return type gets extended.
Therefore, instead of asserting this, it is just checked and if this is the
case 'false' is returned from the GeneralShuffle::add() method. This case
should be very rare and is not handled further by the backend.
Review: Ulrich Weigand.
llvm-svn: 292888
Summary:
This teaches getNode to simplify extracting from Undef. This is similar to what is done for EXTRACT_VECTOR_ELT. It also adds support for extracting from CONCAT_VECTOR when we can reuse one of the inputs to the concat. These seem like simple non-target specific optimizations.
For X86 we currently handle undef in extractSubvector, but not all EXTRACT_SUBVECTOR creations go through there.
Ultimately, my motivation here is to simplify extractSubvector and remove custom lowering for EXTRACT_SUBVECTOR since we don't do anything but handle undef and BUILD_VECTOR optimizations, but those should be DAG combines.
Reviewers: RKSimon, delena
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29000
llvm-svn: 292876
Summary:
There's a comment in XorSlowCase that says "0^0==1" which isn't true. 0 xored with 0 is still 0. So I don't think we need to clear any unused bits here.
Now there is no difference between XorSlowCase and AndSlowCase/OrSlowCase other than the operation being performed
Reviewers: majnemer, MatzeB, chandlerc, bkramer
Reviewed By: MatzeB
Subscribers: chfast, llvm-commits
Differential Revision: https://reviews.llvm.org/D28986
llvm-svn: 292873
A register unit may be allocatable and non-reserved but some of the
register(tuples) built with it are reserved. We still need to calculate
liveness in this case.
Note to out of tree targets: If you start seeing machine verifier errors
with this commit, it probably means that you do not properly mark super
registers of reserved register as reserved. See for example r292836 or
r292870 for example on how to fix that.
rdar://29996737
Differential Revision: https://reviews.llvm.org/D28881
llvm-svn: 292871
When a register like R1 is reserved, X1 should be reserved as well. This
was already done "manually" when 64bit code was enabled, however using
the markSuperRegs() function on the base register is more convenient and
allows to use the checksAllSuperRegsMarked() function even in 32bit mode
to avoid accidental breakage in the future.
This is also necessary to allow https://reviews.llvm.org/D28881
Differential Revision: https://reviews.llvm.org/D29056
llvm-svn: 292870
Running non-LCSSA-preserving LoopSimplify followed by LCSSA on (roughly) the
same loop is incorrect, since LoopSimplify may break LCSSA arbitrarily higher
in the loop nest. Instead, run LCSSA first, and then run LCSSA-preserving
LoopSimplify on the result.
This fixes PR31718.
Differential Revision: https://reviews.llvm.org/D29055
llvm-svn: 292854
Summary: promoteIndirectCall should be a utility function that could be invoked by other optimization passes.
Reviewers: davidxl
Reviewed By: davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29051
llvm-svn: 292850
Summary:
The LibFunc::Func enum holds enumerators named for libc functions.
Unfortunately, there are real situations, including libc implementations, where
function names are actually macros (musl uses "#define fopen64 fopen", for
example; any other transitively visible macro would have similar effects).
Strictly speaking, a conforming C++ Standard Library should provide any such
macros as functions instead (via <cstdio>). However, there are some "library"
functions which are not part of the standard, and thus not subject to this
rule (fopen64, for example). So, in order to be both portable and consistent,
the enum should not use the bare function names.
The old enum naming used a namespace LibFunc and an enum Func, with bare
enumerators. This patch changes LibFunc to be an enum with enumerators prefixed
with "LibFFunc_". (Unfortunately, a scoped enum is not sufficient to override
macros.)
There are additional changes required in clang.
Reviewers: rsmith
Subscribers: mehdi_amini, mzolotukhin, nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D28476
llvm-svn: 292848
When calculating kills, a register may be considered live because a part
of it is live, but if there is a use of that (whole) register, the whole
register (and its subregisters) need to be added to the live set.
llvm-svn: 292845
Summary:
This patch changes the layout of DoubleAPFloat, and adjust all
operations to do either:
1) (IEEEdouble, IEEEdouble) -> (uint64_t, uint64_t) -> PPCDoubleDoubleImpl,
then run the old algorithm.
2) Do the right thing directly.
1) includes multiply, divide, remainder, mod, fusedMultiplyAdd, roundToIntegral,
convertFromString, next, convertToInteger, convertFromAPInt,
convertFromSignExtendedInteger, convertFromZeroExtendedInteger,
convertToHexString, toString, getExactInverse.
2) includes makeZero, makeLargest, makeSmallest, makeSmallestNormalized,
compare, bitwiseIsEqual, bitcastToAPInt, isDenormal, isSmallest,
isLargest, isInteger, ilogb, scalbn, frexp, hash_value, Profile.
I could split this into two patches, e.g. use
1) for all operatoins first, then incrementally change some of them to
2). I didn't do that, because 1) involves code that converts data between
PPCDoubleDoubleImpl and (IEEEdouble, IEEEdouble) back and forth, and may
pessimize the compiler. Instead, I find easy functions and use
approach 2) for them directly.
Next step is to implement move multiply and divide from 1) to 2). I don't
have plans for other functions in 1).
Differential Revision: https://reviews.llvm.org/D27872
llvm-svn: 292839
in llvm-objdump for Mach-O files add the printing of the
x86_thread_state32_t in the same format as
otool-classic(1) on darwin.
To do this the 32-bit x86 general tread state
needed to be defined in include/llvm/Support/MachO.h .
rdar://30110111
llvm-svn: 292829
Since we're now avoiding operations using narrow scalar integer types,
we have to legalize the integer side of the FP conversions.
This requires teaching the legalizer how to do that.
llvm-svn: 292828
Since r279760, we've been marking as legal operations on narrow integer
types that have wider legal equivalents (for instance, G_ADD s8).
Compared to legalizing these operations, this reduced the amount of
extends/truncates required, but was always a weird legalization decision
made at selection time.
So far, we haven't been able to formalize it in a way that permits the
selector generated from SelectionDAG patterns to be sufficient.
Using a wide instruction (say, s64), when a narrower instruction exists
(s32) would introduce register class incompatibilities (when one narrow
generic instruction is selected to the wider variant, but another is
selected to the narrower variant).
It's also impractical to limit which narrow operations are matched for
which instruction, as restricting "narrow selection" to ranges of types
clashes with potentially incompatible instruction predicates.
Concerns were also raised regarding MIPS64's sign-extended register
assumptions, as well as wrapping behavior.
See discussions in https://reviews.llvm.org/D26878.
Instead, legalize the operations.
Should we ever revert to selecting these narrow operations, we should
try to represent this more accurately: for instance, by separating
a "concrete" type on operations, and an "underlying" type on vregs, we
could move the "this narrow-looking op is really legal" decision to the
legalizer, and let the selector use the "underlying" vreg type only,
which would be guaranteed to map to a register class.
In any case, we eventually should mitigate:
- the performance impact by selecting no-op extract/truncates to COPYs
(which we currently do), and the COPYs to register reuses (which we
don't do yet).
- the compile-time impact by optimizing away extract/truncate sequences
in the legalizer.
llvm-svn: 292827
This is a series of patches to enable adding of machine sched
models for ARM processors easier and compact. They define new
sched-readwrites for groups of ARM instructions. This has been
missing so far, and as a consequence, machine scheduler models
for individual sub-targets have tended to be larger than they
needed to be.
The current patch focuses on floating-point instructions.
Reviewers: Diana Picus (rovka), Renato Golin (rengolin)
Differential Revision: https://reviews.llvm.org/D28194
llvm-svn: 292825
Summary:
Add a new load command LC_BUILD_VERSION. It is a generic version of
LC_*_VERSION_MIN load_command used on Apple platforms. Instead of having
a seperate load command for each platform, LC_BUILD_VERSION is recording
platform info as an enum. It also records SDK version, min_os, and tools
that used to build the binary.
rdar://problem/29781291
Reviewers: enderby
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29044
llvm-svn: 292824
Vector immediate load instructions should have the isAsCheapAsAMove, isMoveImm
and isReMaterializable flags set. With them, these instruction will get
hoisted out of loops.
Review: Ulrich Weigand
llvm-svn: 292790
invalidation of deleted functions in GlobalDCE.
This was always testing a bug really triggered in GlobalDCE. Right now
we have analyses with asserting value handles into IR. As long as those
remain, when *deleting* an IR unit, we cannot wait for the normal
invalidation scheme to kick in even though it was designed to work
correctly in the face of these kinds of deletions. Instead, the pass
needs to directly handle invalidating the analysis results pointing at
that IR unit.
I've tought the Inliner about this and this patch teaches GlobalDCE.
This will handle the asserting VH case in the existing test as well as
other issues of the same fundamental variety. I've moved the test into
the GlobalDCE directory and added a comment explaining what is going on.
Note that we cannot simply require LVI here because LVI is too lazy.
llvm-svn: 292773
clearing its body. This is essential to avoid triggering asserting value
handles in analyses on the function's body.
I'm working on a test case for this behavior in LLVM, but Clang has
a great one that managed to trigger this on all of the bots already.
llvm-svn: 292770
become unavailable.
The AssumptionCache is now immutable but it still needs to respond to
DomTree invalidation if it ended up caching one.
This lets us remove one of the explicit invalidates of LVI but the
other one continues to avoid hitting a latent bug.
llvm-svn: 292769
new PM's inliner.
The bug happens when we refine an SCC after having computed a proxy for
the FunctionAnalysisManager, and then proceed to compute fresh analyses
for functions in the *new* SCC using the manager provided by the old
SCC's proxy. *And* when we manage to mutate a function in this new SCC
in a way that invalidates those analyses. This can be... challenging to
reproduce.
I've managed to contrive a set of functions that trigger this and added
a test case, but it is a bit brittle. I've directly checked that the
passes run in the expected ways to help avoid the test just becoming
silently irrelevant.
This gets the new PM back to passing the LLVM test suite after the PGO
improvements landed.
llvm-svn: 292757
We need to set BINARY_DIR to: ${CMAKE_BINARY_DIR}/lib/Fuzzer/test , so the dll
is placed in the same directory than the test LLVMFuzzer-DSOTest, and is found
when executing that test.
As we are using CMAKE_CXX_CREATE_SHARED_LIBRARY to link the dll, we can't modify
the output directory for the import library. It will be created in the same
directory than the dll (in BINARY_DIR), no matter which value we set to
LIBRARY_DIR. So, if we set LIBRARY_DIR to a different directory than BINARY_DIR,
when linking LLVMFuzzer-DSOTest, cmake will look for the import library
LLVMFuzzer-DSO1.lib in LIBRARY_DIR, and won't find it, since it was created in
BINARY_DIR. So, for Windows, we need that LIBRARY_DIR and BINARY_DIR are the
same directory.
Differential Revision: https://reviews.llvm.org/D27870
llvm-svn: 292748
Don't check for InFuzzingThread() on Windows, since the AlarmHandler() is
always executed by a different thread from a thread pool.
If we don't add these changes, the alarm handler will never execute.
Note that we decided to ignore possible problem in the synchronization.
Differential Revision: https://reviews.llvm.org/D28723
llvm-svn: 292746
I add 2 changes to make the tests work on 32 bits and on 64 bits.
I change the size allocated to 0x20000000 and add the flag: -rss_limit_mb=300.
Otherwise the output for 32 bits and 64 bits is different.
For 64 bits the value 0xff000000 doesn't exceed kMaxAllowedMallocSize.
For 32 bits, kMaxAllowedMallocSize is set to 0xc0000000, so the call to
Allocate() will fail earlier printing "WARNING: AddressSanitizer failed to
allocate ..." , and wont't call malloc hooks.
So, we need to consider a size smaller than 2GB (so malloc doesn't fail on
32bits) and greater that the value provided by -rss_limit_mb.
Because of that I use: 0x20000000.
Differential Revision: https://reviews.llvm.org/D28706
llvm-svn: 292744
Fix libFuzzer when setting -close_fd_mask to a non-zero value.
In previous implementation, libFuzzer closes the file descriptors for
stdout/stderr. This has some disavantages:
For `fuzzer-fdmask.test`, we write directly to stdout and stderr using the
file streams stdout and stderr, after the file descriptors are closed, which is
undefined behavior. In Windows, in particular, this was making the test fail.
Also, if we close stdout and we open a new file in libFuzzer, we get the file
descriptor 1, which could generate problem if some code assumes file descriptors
refers to stdout and works directly writing to the file descriptor 1, but it
will be writing to the opened file (for example using std::cout).
Instead of closing the file descriptors, I redirect the output to /dev/null on
linux and nul on Windows.
Differential Revision: https://reviews.llvm.org/D28718
llvm-svn: 292743
This changes is necessary on Windows, where libraries doesn't include the prefix
"lib".
Differential Revision: https://reviews.llvm.org/D28710
llvm-svn: 292742
Update `ListFilesInDirRecursive` implementation on Windows to have the same
behavior than for Posix, when the directory doesn't exists and when it is empty.
Differential Revision: https://reviews.llvm.org/D28711
llvm-svn: 292741
Instead of directly using objdump, which is not present on Windows, we consider
different tools depending on the platform.
For Windows, we consider dumpbin and llvm-objdump.
Differential Revision: https://reviews.llvm.org/D28635
llvm-svn: 292739
We need to build all the tests with -O0, otherwise optimizations may merge some
basic blocks and the tests will fail.
In this diff, I simplify the cmake implementation and I remove the flags for
Windows too (/O[123s]).
Differential Revision: https://reviews.llvm.org/D28632
llvm-svn: 292737
We need to expose Sanitizer Coverage's functions that are rewritten with a
different implementation, so compiler-rt's libraries have access to it.
Differential Revision: https://reviews.llvm.org/D28618
llvm-svn: 292736
Remove dependency on FileCheck, sancov and not for tests on Windows.
If LLVM_USE_SANITIZER=Address and LLVM_USE_SANITIZE_COVERAGE=YES, this will
trigger the building of dependencies with sanitizer instrumentation.
This will fail in Windows, since cmake will use link.exe for linking and won't
include compiler-rt libraries.
Differential Revision: https://reviews.llvm.org/D27993
llvm-svn: 292735
We may be able to assert that no shl-shl or lshr-lshr pairs ever get here
because we should have already handled those in foldShiftedShift().
llvm-svn: 292726
This is similar to what the caller (matchSelectPattern()) does. In all
cases where we succeed in matching a min/max pattern, the values in
that pattern will be the values of the 'select', so hoist that and
remove a bunch of duplicated code.
llvm-svn: 292725
the library routine shared with the new PM and other code.
This assert checks that when LCSSA preservation is requested we start in
LCSSA form. Without this early assert, given *very* complex test cases
we can hit an assert or crash much later on when trying to preserve
LCSSA.
The new PM's loop simplify doesn't need to (and indeed can't) preserve
LCSSA as the new PM doesn't deal in transforms in the dependency graph.
But we asked the library to and shockingly, this didn't work very well!
Stop doing that. Now the assert will tell us immediately with existing
test cases. Before this, it took a pretty convoluted input to trigger
this.
However, sinking the assert also found a bug in LoopUnroll where we
asked simplifyLoop to preserve LCSSA *right before we reform it*. That's
kinda silly and unsurprising that it wasn't available. =D Stop doing
that too.
We also would assert that the unrolled loop was in LCSSA even if
preserving LCSSA was never requested! I don't have a test case or
anything here. I spotted it by inspection and it seems quite obvious. No
logic change anyways, that's just avoiding a spurrious assert.
llvm-svn: 292710
Re-Commit r292543 with a fix for the situation when the chain end is
MBB.end().
This function can be used to accumulate the set of all read and modified
register in a sequence of instructions.
Use this code in AArch64A57FPLoadBalancing::scavengeRegister() to prove
the concept.
- The AArch64A57LoadBalancing code is using a backwards analysis now
which is irrespective of kill flags. This is the main motivation for
this change.
Differential Revision: http://reviews.llvm.org/D22082
llvm-svn: 292705
Summary:
Under option -mergefunc-preserve-debug-info we:
- Do not create a new function for a thunk.
- Retain the debug info for a thunk's parameters (and associated
instructions for the debug info) from the entry block.
Note: -debug will display the algorithm at work.
- Create debug-info for the call (to the shared implementation) made by
a thunk and its return value.
- Erase the rest of the function, retaining the (minimally sized) entry
block to create a thunk.
- Preserve a thunk's call site to point to the thunk even when both occur
within the same translation unit, to aid debugability. Note that this
behaviour differs from the underlying -mergefunc implementation which
modifies the thunk's call site to point to the shared implementation
when both occur within the same translation unit.
Reviewers: echristo, eeckstein, dblaikie, aprantl, friss
Reviewed By: aprantl
Subscribers: davide, fhahn, jfb, mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D28075
llvm-svn: 292702