AArch64 has specific instructions to multiply two numbers at double the width
and produce the high part of the result. These can be used to implement LLVM's
mul.with.overflow instructions fairly simply. Helps with C++ operator new[].
llvm-svn: 294519
Implement getRegPressureLimit and getRegPressureSetLimit callbacks in
SIRegisterInfo.
This makes standard converge scheduler to behave almost the same as
GCNScheduler, sometime slightly better sometimes a bit worse.
In gerenal that is also possible to switch GCNScheduler to use these
callbacks instead of getMaxWaves(), which also makes GCNScheduler
slightly better on some tests and slightly worse on another. A big
win is behavior with converge scheduler.
Note, these are used not only by scheduling, but in places like
MachineLICM.
Differential Revision: https://reviews.llvm.org/D29700
llvm-svn: 294518
Summary:
This patch removes the over-specified dependencies from LLDBDependencies and instead relies on the dependencies as expressed in each library and tool.
This also removes the library looping in favor of allowing CMake to do its thing. I've tested this patch on Darwin, and found no issues, but since linker semantics vary by system I'll also work on testing it on other platforms too.
Help testing would be greatly appreciated.
Reviewers: labath, zturner
Subscribers: danalbert, srhines, mgorny, jgosnell, lldb-commits
Differential Revision: https://reviews.llvm.org/D29352
llvm-svn: 294515
Summary: This patch is required by D28855, and enables us to rely on CMake's ability to handle out of order target dependencies.
Reviewers: mgorny, chapuni, bryant
Subscribers: llvm-commits, jgosnell
Differential Revision: https://reviews.llvm.org/D28869
llvm-svn: 294514
In r293373 we switched the build to linking dynamically against the
Universal CRT and include the redistributables in the installer.
However, clang-format.exe is copied into the vsix and needs to be
statically linked. This commit makes us build the plugin in a separate
step that uses static linking.
llvm-svn: 294513
This module will contain nothing but vtable definitions and (soon)
available_externally function definitions, so there is no point in keeping
debug info in the module.
Differential Revision: https://reviews.llvm.org/D28913
llvm-svn: 294511
When building for Windows, we would check if we were using MSVC rather
than WIN32. This resulted in needed targets not being defined by
sanitizer_common. Fix the conditional.
When registering the objects libraries for ASAN, we would multiply
register for all targets as we were creating them inside a loop over all
architectures. Only define the target per architecture.
llvm-svn: 294510
Fix the test zlib conditional to use LLVM_ENABLE_ZLIB value when
building stand-alone. The HAVE_LIBZ is not available when performing
a stand-alone build. Since the zlib support is a feature of
the underlying LLVM library, it exports the actual status as the final
value of LLVM_ENABLE_ZLIB in LLVMConfig.
While at it, canonicalize the boolean value into 0/1 and remove unused
CMake definitions (most likely copied from clang).
Differential Revision: https://reviews.llvm.org/D29340
llvm-svn: 294508
Use both LLD- and LLVM-specific binary&library directories when LLD is
being built stand-alone. This ensures that the freshly built tools and
libraries are found and used correctly.
Without this patch, the test suite uses LLVM_TOOLS_DIR and LLVM_LIBS_DIR
to locate lld, and set PATH and LD_LIBRARY_PATH. When doing
a stand-alone builds, these variables represent the installed LLVM.
As a result, tests either fail due to missing lld executables/libraries
or use an earlier installed LLD version rather than the one being built.
To solve this, an additional LLD_TOOLS_DIR and LLD_LIBS_DIR variables
are added that are populated using LLVM_*_OUTPUT_INTDIR. Those variables
are populated with directories used to output built executables
and libraries. In stand-alone builds, they represent the directories
used by LLD. In integrated builds, they have the same values as
LLVM_*_DIR and therefore using them does not harm.
The new variables are prepended to PATH and LD_LIBRARY_PATH to ensure
that freshly built binaries are preferred over potentially earlier
installed ones. Furthermore, the resulting PATH is used to locate tools
for substitutions since the search includes both tools built as part of
LLD and of LLVM.
Differential Revision: https://reviews.llvm.org/D29335
llvm-svn: 294507
nested-name-specifier (as the standard appears to require), treat it as the
type specifier 'decltype(auto)' followed by a nested-name-specifier starting
with '::'.
llvm-svn: 294506
Making the cost model selecting between Interleave, GatherScatter or Scalar vectorization form of memory instruction.
The right decision should be done for non-consecutive memory access instrcuctions that may have more than one vectorization solution.
This patch includes the following changes:
- Cost Model calculates the cost of Load/Store vector form and choose the better option between Widening, Interleave, GatherScactter and Scalarization. Cost Model keeps the widening decision.
- Arrays of Uniform and Scalar values are moved from Legality to Cost Model.
- Cost Model collects Uniforms and Scalars per VF. The collection is based on CM decision map of Loadis/Stores vectorization form.
- Vectorization of memory instruction is performed according to the CM decision.
Differential Revision: https://reviews.llvm.org/D27919
llvm-svn: 294503
The test is failing on the bot because "/subsystem:console" was
truncated for some reason. I don't know why that is happening on
that machine (it is not reproducible on my Windows machine).
In this patch, I'm trying to tame it by making the output shorter.
llvm-svn: 294502
Summary: This adds an option to save temporary files generated during link-time optimization. This can be useful for debugging.
Reviewers: ruiu, pcc
Reviewed By: ruiu, pcc
Subscribers: mehdi_amini
Differential Revision: https://reviews.llvm.org/D29518
llvm-svn: 294498
Summary: lib/sanitizer_common/sanitizer_win_defs.h defineds WINAPI, which is also defined by standard Windows headers. Redefining it causes warnings during compilation. This change causes us to only define WINAPI if it is not already defined, which avoids the warnings.
Reviewers: rnk, zturner, mpividori
Reviewed By: rnk, mpividori
Subscribers: kubamracek
Differential Revision: https://reviews.llvm.org/D29683
llvm-svn: 294497
The AAPCS ABI is substantially more complicated so that's coming in a separate
patch. For now we can generate correct code for iOS though.
llvm-svn: 294493
Summary:
These flags allow specifying extra arguments to the tool's command
line which don't appear in the compilation database.
Reviewers: alexfh, klimek, bkramer
Subscribers: JDevlieghere, cfe-commits
Differential Revision: https://reviews.llvm.org/D29699
llvm-svn: 294491
This is a follow-up to https://reviews.llvm.org/D29349. It turns out
that NeedUpgradeToDIGlobalVariableExpression is always necessary when
we encountered a version==0 record because it may always be referenced
via a list of globals in a DICompileUnit. My tests weren't good enough
to catch this though. To trigger this case, we need much older bitcode
produced by LLVM around version 3.7.
<rdar://problem/30404262>
Differential Revision: https://reviews.llvm.org/D29693
llvm-svn: 294488
with temporarily file name fix in testcase.
Original commit message:
-q, --emit-relocs - Generate relocations in output
Simplest implementation:
* no GC case,
* no "/DISCARD/" linkerscript command support.
This patch is extracted from D28612 / D29636,
Relative to PR31579.
Differential revision: https://reviews.llvm.org/D29663
llvm-svn: 294469
Sometimes the MS ABI needs to emit thunks for declarations that don't
have bodies. Destructor thunks make calls to inlinable functions, so
they need line info or LLVM will complain.
Fixes PR31893
llvm-svn: 294465
-q, --emit-relocs - Generate relocations in output
Simplest implementation:
* no GC case,
* no "/DISCARD/" linkerscript command support.
This patch is extracted from D28612 / D29636,
Relative to PR31579.
Differential revision: https://reviews.llvm.org/D29663
llvm-svn: 294464
Summary:
LVI is now depth first, which is optimal for iteration strategy in
terms of work per call. However, the way the results get cached means
it can still go very badly N^2 or worse right now. The overdefined
cache is per-block, because LVI wants to try to get different results
for the same name in different blocks (IE solve the problem
PredicateInfo solves). This means even if we discover a value is
overdefined after going very deep, it doesn't cache this information,
causing it to end up trying to rediscover it again and again. The
same is true for values along the way. In practice, overdefined
anywhere should mean overdefined everywhere (this is how, for example,
SCCP works).
Until we get around to reworking the overdefined cache, we need to
limit the worklist size we process. Note that permanently reverting
the DFS strategy exploration seems the wrong strategy (temporarily
seems fine if we really want). BFS is clearly the wrong approach, it
just gets luckier on some testcases. It's also very hard to design
an effective throttle for BFS. For DFS, the throttle is directly related
to the depth of the CFG. So really deep CFGs will get cutoff, smaller
ones will not. As the CFG simplifies, you get better results.
In BFS, the limit is it's related to the fan-out times average block size,
which is harder to reason about or make good choices for.
Bug being filed about the overdefined cache, but it will require major
surgery to fix it (plumbing predicateinfo through CVP or LVI).
Note: I did not make this number configurable because i'm not sure
anyone really needs to tweak this knob. We run CVP 3 times. On the
testcases i have the slow ones happen in the middle, where CVP is
doing cleanup work other things are effective at. Over the course of
3 runs, we don't see to have any real loss of performance.
I haven't gotten a minimized testcase yet, but just imagine in your
head a testcase where, going *up* the CFG, you have branches, one of
which leads 50000 blocks deep, and the other, to something where the
answer is overdefined immediately. BFS would discover the overdefined
faster than DFS, but do more work to do so. In practice, the right
answer is "once DFS discovers overdefined for a value, stop trying to
get more info about that value" (and so, DFS would normally cache the
overdefined results for every value it passed through in those 50k
blocks, and never do that work again. But it don't, because of the
naming problem)
Reviewers: chandlerc, djasper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29715
llvm-svn: 294463