Commit Graph

381721 Commits

Author SHA1 Message Date
Stephen Kelly 243cd0afad [ASTMatchers] Make Param functors variadic
Differential Revision: https://reviews.llvm.org/D97156
2021-03-03 11:41:20 +00:00
Sam McCall 1a4990a4f7 [clangd] Fix uninit member 2021-03-03 11:45:16 +01:00
JinGu Kang 394a4d0433 [AArch64] Add missing intrinsics for vcls
Differential Revision: https://reviews.llvm.org/D97775
2021-03-03 10:17:56 +00:00
Mikael Holmen 85b67d5fa9 [lld][MachO] Silence "enumeral and non-enumeral type" warning from gcc
gcc complained with

[1110/1140] Building CXX object tools/lld/MachO/CMakeFiles/lldMachO2.dir/SyntheticSections.cpp.o
../../lld/MachO/SyntheticSections.cpp: In function 'int16_t ordinalForDylibSymbol(const lld::macho::DylibSymbol&)':
../../lld/MachO/SyntheticSections.cpp:287:14: warning: enumeral and non-enumeral type in conditional expression [-Wextra]
  286 |   return config->namespaceKind == NamespaceKind::flat || dysym.isDynamicLookup()
      |          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  287 |              ? MachO::BIND_SPECIAL_DYLIB_FLAT_LOOKUP
      |              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  288 |              : dysym.getFile()->ordinal;
      |              ~~~~~~~~~~~~~~~~~~~~~~~~~~
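The warning above fires when the two arms of a conditional expression mix an enumeration type with an integer type. The following is a minimal sketch of one common way to silence this class of warning, not necessarily the change made in this commit. The names `SpecialDylib`, `FlatLookup` and `ordinalFor` are hypothetical stand-ins, not the real lld declarations.

```cpp
#include <cstdint>

// Hypothetical stand-in for an enum of special ordinals; the value is
// illustrative only.
enum SpecialDylib : int { FlatLookup = -2 };

// Casting the enumerator gives both arms of ?: the same integer type,
// so the conditional no longer mixes enumeral and non-enumeral operands.
int16_t ordinalFor(bool useFlatLookup, int16_t fileOrdinal) {
  return useFlatLookup ? static_cast<int16_t>(FlatLookup) : fileOrdinal;
}
```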
2021-03-03 10:39:35 +01:00
Andy Yankovsky 3b47bd32f9 [lldb] Fix handling of `DW_AT_decl_file` according to D91014 (attempt #2)
Apply changes from https://reviews.llvm.org/D91014 to other places where DWARF entries are being processed.

Test case is provided by @jankratochvil.
The test is marked to run only on x64 and exclude Windows and Darwin, because the assembly is not OS-independent.

(First attempt https://reviews.llvm.org/D96778 broke the build bots)

Reviewed By: jankratochvil

Differential Revision: https://reviews.llvm.org/D97765
2021-03-03 10:27:35 +01:00
Piotr Sobczak c3ce7bae80 [AMDGPU] Rename amdgcn_wwm to amdgcn_strict_wwm
* Introduce the new intrinsic amdgcn_strict_wwm
* Deprecate the old intrinsic amdgcn_wwm

The change is done for consistency as the "strict"
prefix will become an important, distinguishing factor
between amdgcn_wqm and amdgcn_strictwqm in the future.

The "strict" prefix indicates that inactive lanes do not
take part in control flow, specifically an inactive lane
enabled by a strict mode will always be enabled irrespective
of control flow decisions.

The amdgcn_wwm will be removed, but doing so in two steps
gives users time to switch to the new name at their own pace.

Reviewed By: critson

Differential Revision: https://reviews.llvm.org/D96257
2021-03-03 09:33:57 +01:00
Carl Ritson 2ddac69f98 [AMDGPU] Rename llvm.amdgcn.msaa.load to llvm.amdgcn.msaa.load.x
While the underlying instruction is called image_msaa_load,
the resource must be x component only.
Rename the intrinsic for clarity.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D97829
2021-03-03 17:30:39 +09:00
Thomas Preud'homme 09c3573903 [FileCheck] Do not skip end of line in diagnostics
When commit da108b4ed4 introduced
the CHECK-NEXT directive, it added logic to skip to the next line when
printing a diagnostic if the current matching position is at the end of
a line. This was fine while FileCheck did not support regular expressions,
but now that it does, it can be confusing when the pattern to match
starts with the expectation of a newline (e.g. CHECK-NEXT: {{\n}}foo).
It is also inconsistent with the column information in the diagnostic
which does point to the end of line.

This commit removes this logic altogether, such that failure to match
diagnostic for such cases would show the end of line and be consistent
with the column information. The commit also adapts all existing
testcases accordingly.

Note to reviewers: An alternative approach would be to restrict the code
to only skip to the next line if the first character of the pattern is
known not to match a whitespace-like character. This would respect the
original intent but keep the inconsistency in terms of column info and
require more code. I chose the current approach only out of laziness
and would be happy to restrict the logic instead.

Reviewed By: jdenny, jhenderson

Differential Revision: https://reviews.llvm.org/D93341
2021-03-03 08:20:39 +00:00
Petr Hosek 6e3946c9f5 [runtimes] Use standalone build only for compiler-rt
compiler-rt needs to use standalone build because of the assumptions
made by its build, but other runtimes can use non-standalone build.

Differential Revision: https://reviews.llvm.org/D97575
2021-03-03 00:06:20 -08:00
David Green ab280cbaa3 [ARM] Ensure undef is propagated to CBZ/CBNZ flags
In some rare circumstances we can be using an undef register for a
compare. When folded into a CBZ/CBNZ the undef flags are lost, leading
to machine verifier problems. This propagates the existing flags to the
new instruction.
2021-03-03 08:02:58 +00:00
Andy Wingo 4307069df4 [WebAssembly] Swap operand order of call_indirect in text format
The WebAssembly text and binary formats have different operand orders
for the "type" and "table" fields of call_indirect (and
return_call_indirect).  In LLVM we use the binary order for the MCInstr,
but when we produce or consume the text format we should use the text
order.  For compilation units targeting WebAssembly 1.0 (without the
reference types feature), we omit the table operand entirely.

Differential Revision: https://reviews.llvm.org/D97761
2021-03-03 08:51:21 +01:00
Prateek Pardeshi 50e34497ac [Polly] Refactoring IsOutermostParallel() from Integer Set Library (ISL) to take the C++ wrapper
Polly uses algorithms from the Integer Set Library (isl), which is written in C and is therefore incompatible with the rest of LLVM, which is written in C++.

Changes made:
* Refactoring IsOutermostParallel() to take C++ bindings instead of reference-counting in C isl lib.
* Addition of manage_copy() to be used as reference for C objects instead of IsOutermostParallel()

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D97751
2021-03-03 01:49:37 -06:00
Hsiangkai Wang f7e675b3da [RISCV] Use RISCVV_BUILTIN for vector intrinsic checking.
There may be other BUILTINs for other extensions. Use RISCVV_BUILTIN for
vector builtin checking.

Differential Revision: https://reviews.llvm.org/D97825
2021-03-03 13:42:54 +08:00
Qiu Chaofan 72d4a41ba6 [PowerPC] Allow spilling GPR to VSR on AIX
This patch enables spilling GPR to VSRs instead of stack under AIX ABI.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D97367
2021-03-03 13:32:39 +08:00
Craig Topper 543b901e58 [LegalizeVectorTypes] Improve SplitVecRes_INSERT_SUBVECTOR to handle subvector being in the high half of the split or not at element 0 of the low half.
This function isn't exercised in lit tests today according to
the code coverage report. But it will be after the tests in D97543 and
D97559.

Posting this patch to help with a crash that Fraser hit.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D97582
2021-03-02 21:14:13 -08:00
Jianzhou Zhao ac4c1760b2 Fix the build error caused by D97570 2021-03-03 04:47:00 +00:00
Jianzhou Zhao d866b9c99d [dfsan] Propagate origin tracking at load
This is a part of https://reviews.llvm.org/D95835.

One issue is about origin load optimization: see the
comments of useCallbackLoadLabelAndOrigin

@gbalats This change may have some conflicts with your 8bit change. PTAL the change at visitLoad.

Reviewed By: morehouse, gbalats

Differential Revision: https://reviews.llvm.org/D97570
2021-03-03 04:32:30 +00:00
Nathan James 335375ef2c [clang][NFC] pack StaticDiagInfoRec
Exchanging types, reordering fields and borrowing a bit from OptionGroupIndex shrinks this from 12 bytes to 8.
This knocks ~20k from the binary size.
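To illustrate the kind of saving described, here is a minimal sketch of field reordering plus a borrowed bit. The record below is hypothetical and does not reproduce the real StaticDiagInfoRec fields; the sizes in the comments assume a common ABI with 4-byte alignment for `uint32_t`.

```cpp
#include <cstdint>

// Hypothetical record with fields ordered so that padding is needed
// before Offset and before Group (typically 12 bytes).
struct LooseRec {
  uint8_t Kind;
  uint32_t Offset;
  uint8_t Flags;
  uint16_t Group;
};

// Same information, largest field first, with one bit borrowed from
// Group to hold an extra flag (typically 8 bytes).
struct PackedRec {
  uint32_t Offset;
  uint16_t Group : 15;
  uint16_t Deferrable : 1; // hypothetical flag stored in the borrowed bit
  uint8_t Kind;
  uint8_t Flags;
};
```

The trade-off is that Group can now hold only 15 bits, so this only works when the largest index fits in that range.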

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D97553
2021-03-03 02:53:10 +00:00
Nathan James 1a91b8232a [clang-tidy][NFC] Use equalsBoundNode matchers to simplify LoopConvertCheck
Make use of the `equalsBoundNode` matcher to ensure Init, Condition and Increment variables all refer to the same variable during matching.

Reviewed By: steveire

Differential Revision: https://reviews.llvm.org/D97639
2021-03-03 02:51:34 +00:00
Wang, Pengfei fd79aa7294 [NFC] Add x86_amx and some missed half, bfloat keywords to llvm plugin syntaxes
Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D97444
2021-03-03 10:01:10 +08:00
George Balatsouras 6ff18b08e6 [dfsan] Fix clang-tidy warnings
This addresses ~50 clang-tidy warnings on dfsan instrumentation pass.
It also contains some refactoring (all non-functional changes) to eliminate some variables and simplify code.

Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D97714
2021-03-02 17:37:45 -08:00
Felix Berger a189b3b9e8 [clang-tidy] performance-for-range-copy: Don't trigger on implicit type conversions.
This disables the check for false positive cases where implicit type conversion
through either an implicit single argument constructor or a member conversion
operator is triggered when constructing the loop variable.

Fix the test cases that were meant to cover these cases.

Differential Revision: https://reviews.llvm.org/D97577

Reviewed-by: hokein
2021-03-02 20:02:48 -05:00
Petr Hosek b3ac90da1d Revert "[runtimes] Use standalone build only for compiler-rt"
This reverts commit 4e421b2323 as this
seems to have broken Python 3 executable detection on some builders.
2021-03-02 16:59:32 -08:00
Petr Hosek 1d1983f2d0 [CMake] Enable Polly for Fuchsia toolchain build
We want to enable the use of Polly in Fuchsia.

Differential Revision: https://reviews.llvm.org/D97819
2021-03-02 16:51:16 -08:00
Jonas Devlieghere db8b1598b7 [lldb] Inline objc_opt->version >= 14 to avoid dealing with bool type 2021-03-02 16:41:44 -08:00
Jonas Devlieghere c85d47f7b8 [lldb] Add more logging to __lldb_apple_objc_v2_get_dynamic_class_info 2021-03-02 16:24:59 -08:00
Victor Huang 1756b2adc9 [AIX][TLS] Generate TLS variables in assembly files
This patch allows generating TLS variables in assembly files on AIX.
Initialized and external uninitialized variables are generated with the
.csect pseudo-op and local uninitialized variables are generated with
the .comm/.lcomm pseudo-ops. The patch also adds a check to
explicitly say that TLS is not yet supported on AIX.

Reviewed by: daltenty, jasonliu, lei, nemanjai, sfertile
Originally patched by: bsaleil
Commandeered by: NeHuang

Differential Revision: https://reviews.llvm.org/D96184
2021-03-02 18:22:48 -06:00
Petr Hosek 4e421b2323 [runtimes] Use standalone build only for compiler-rt
compiler-rt needs to use standalone build because of the assumptions
made by its build, but other runtimes can use non-standalone build.

Differential Revision: https://reviews.llvm.org/D97575
2021-03-02 16:21:35 -08:00
zoecarver 84a50f5911 [libc++] Add bind_front function (P0356R5).
Implements [[ http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0356r5.html | P0356R5 ]]. Adds `bind_front` to `functional`.
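For readers unfamiliar with the facility, the idea behind `bind_front` can be sketched as follows. This is a simplified illustration, not the libc++ implementation: `bindFrontSketch` and `digits` are hypothetical names, and the real `std::bind_front` additionally preserves the value category and constness of the stored callable and arguments.

```cpp
#include <utility>

// Simplified sketch: store the callable and the leading arguments by
// copy, then forward any remaining arguments at call time.
template <class F, class... Bound>
auto bindFrontSketch(F f, Bound... bound) {
  return [f, bound...](auto &&...rest) {
    return f(bound..., std::forward<decltype(rest)>(rest)...);
  };
}

// Small helper used to make the argument order visible.
inline int digits(int a, int b, int c) { return a * 100 + b * 10 + c; }
```

Calling `bindFrontSketch(digits, 1, 2)` yields a callable that takes the remaining third argument.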

Reviewed By: ldionne, #libc, Quuxplusone

Differential Revision: https://reviews.llvm.org/D60368
2021-03-02 16:18:06 -08:00
Jonas Devlieghere f46a441b1c [lldb] Extend Python convenience variable table with equivalent APIs
Add a column to the table of convenience variables with the equivalent
API to get to the current debugger, target, process, etc.

We often get asked to make convenience variables available outside of
the interactive interpreter. After explaining why that's not possible, a
common complaint is that it's hard to find out how to get to these
variables in a non-interactive context, for example how to get to the
current frame when given a thread. This patch aims to alleviate that by
including the APIs to navigate between these instances in the table.

Differential revision: https://reviews.llvm.org/D97778
2021-03-02 16:13:55 -08:00
Neal (nealsid) 5826aa48f0 Migrate to llvm::unique_function instead of static member functions for callbacks
A few cleanups suggested in another patch review's comments:

1. Use llvm::unique_function for storing & invoking callbacks from
   Editline to IOHandler
2. Change return type of one of the callback setters from bool to void,
   since its return value was never used
3. Moved the callback setters inline & made them nonstatic, since that's
   more consistent with other setter definitions
4. Removed the baton parameter since we no longer need it

Differential revision: https://reviews.llvm.org/D50299
2021-03-02 16:13:54 -08:00
Arthur Eubanks 99f1e86cbb [opt] Error if -debug-pass is specified alongside the new PM
Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D97810
2021-03-02 15:59:28 -08:00
Andrei Elovikov b24afec8ae [NFCI][VPlan] Modify Recipes' print methods to honor Indent parameter
Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D97787
2021-03-02 15:32:10 -08:00
Fangrui Song 1e46b6f401 [test] Fix CodeGen/VE/Scalar tests 2021-03-02 15:30:44 -08:00
Aart Bik 5b333d3449 [mlir][sparse] do not ignore ordering for "dense" tensor linked with sparse type
Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D97795
2021-03-02 15:21:51 -08:00
Peter Steinfeld 16005fd979 [flang] Detect circularly defined interfaces of procedures
It's possible to define a procedure whose interface depends on a procedure
which has an interface that depends on the original procedure.  Such a circular
definition was causing the compiler to fall into an infinite loop when
resolving the name of the second procedure.  It's also possible to create
circular dependency chains of more than two procedures.

I fixed this by adding the function HasCycle() to the class DeclarationVisitor
and calling it from DeclareProcEntity() to detect procedures with such
circularly defined interfaces.  I marked the associated symbols of such
procedures by calling SetError() on them.  When processing subsequent
procedures, I called HasError() before attempting to analyze their interfaces.
Unfortunately, this did not work.

With help from Tim, we determined that the SymbolSet used to track the
erroneous symbols was instantiated using a "<" operator which was defined using
the location of the name of the procedure.  But the location of the procedure
name was being changed by a call to ReplaceName() between the times that the
calls to SetError() and HasError() were made.  This caused HasError() to
incorrectly report that a symbol was not in the set of erroneous symbols.

I fixed this by changing SymbolSet to be an unordered set that uses the
contents of the name of the symbol as the basis for its hash function.  This
works because the contents of the name of the symbol is preserved by
ReplaceName() even though its location changes.
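The fix described above can be illustrated with a minimal sketch. The types here (`Sym`, `NameHash`, `ErrorSet`) are hypothetical stand-ins and not the Flang definitions; the sketch only assumes what the message states, namely that the hash depends on the name's contents and that the location may change after insertion.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for a symbol: the name text is stable, while the
// location can change after the symbol has been inserted into a set.
struct Sym {
  std::string name;
  std::size_t location;
};

// Hash on the contents of the name only, so a location change cannot
// move the symbol to a different bucket.
struct NameHash {
  std::size_t operator()(const Sym *s) const {
    return std::hash<std::string>{}(s->name);
  }
};

// Equality is pointer identity (the default for pointer keys), so two
// distinct symbols with the same name remain distinct elements.
using ErrorSet = std::unordered_set<const Sym *, NameHash>;

bool foundAfterLocationChange() {
  Sym proc{"p", 10};
  ErrorSet errors;
  errors.insert(&proc);            // analogous to SetError()
  proc.location = 99;              // analogous to ReplaceName()
  return errors.count(&proc) == 1; // analogous to HasError()
}
```

An ordered set whose comparator reads `location` would not give this guarantee, because changing the key of an element already in the set breaks its ordering invariant.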

I also fixed the error message used when reporting recursively defined
dummy procedure arguments by removing extra apostrophes and sorting the
list of symbols.

I also added tests that will crash the compiler without this change.

Note that the "<" operator is used in other contexts, for example, in the map
of characterized procedures, maps of items in equivalence sets, maps of
structure constructor values, ...  All of these situations happen after name
resolution has been completed and all calls to ReplaceName() have already
happened and thus are not subject to the problem I ran into when ReplaceName()
was called when processing procedure entities.

Note also that the implementation of the "<" operator uses the relative
location in the cooked character stream as the basis of its implementation.
This is potentially problematic when symbols from different compilation units
(for example symbols originating in .mod files) are put into the same map since
their names will appear in two different source streams which may not be
allocated in the same relative positions in memory.  But I was unable to create
a test that caused a problem.  Using a direct comparison of the content of the
name of the symbol in the "<" operator has problems.  Symbols in enclosing or
parallel scopes can have the same name.  Also using the location of the symbol
in the cooked character stream has the advantage that it preserves the
order of the symbols in a structure constructor constant, which makes matching
the values with the symbols relatively easy.

This patch supersedes D97749.

Differential Revision: https://reviews.llvm.org/D97774
2021-03-02 15:18:12 -08:00
Nico Weber 900f076113 hack to unbreak check-llvm on win after https://reviews.llvm.org/D97335
fix attempt http://reviews.llvm.org/rGbbdb4c8c9bcef0e didn't work

The problem is that the test tries to look up
llvm_orc_registerJITLoaderGDBWrapper from the llvm-jitlink.exe
executable, but the symbol wasn't exported. Just manually export it
for now. There's a FIXME with a suggestion for a real fix.
2021-03-02 18:10:28 -05:00
Kamlesh Kumar 5c3fc5093a [libunwind] [risc-v] This patch is for fixing
immediate build failure when Cross Unwinding is enabled.
A follow-up patch will clean up some macro handling.

Differential Revision: https://reviews.llvm.org/D97762
2021-03-03 04:32:47 +05:30
Hansang Bae b6c2f538b2 [OpenMP] Add allocator support for target memory
This is a preview of allocator support for target memory that depends on the
offload runtime API which allocates memory as described below.

llvm_omp_target_alloc_host(size_t size, int device_num);
-- Returns non-migratable memory owned by host.
-- Memory is accessible by host and device(s).

llvm_omp_target_alloc_shared(size_t size, int device_num);
-- Returns migratable memory owned by host and device.
-- Memory is accessible by host and device.

llvm_omp_target_alloc_device(size_t size, int device_num);
-- Returns memory owned by device.
-- Memory is only accessible by device.

New memory space and predefined allocator names are
-- llvm_omp_target_host_mem_space
-- llvm_omp_target_shared_mem_space
-- llvm_omp_target_device_mem_space
-- llvm_omp_target_host_mem_alloc
-- llvm_omp_target_shared_mem_alloc
-- llvm_omp_target_device_mem_alloc

Differential Revision: https://reviews.llvm.org/D96669
2021-03-02 16:45:12 -06:00
Christopher Di Bella eadece333f [libcxx] adds common_reference to <type_traits>
Implements part of P0898R3 Standard Library Concepts

Reworks D74351 to use requires-clauses over SFINAE and so that it more
closely follows the wording.

Co-authored by: Michael Schellenberger Costa <mschellenbergercosta@googlemail.com>

(Michael did all the heavy lifting and I came in to polish it for
 submission, since Michael is focussing on `std::format` now.)

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D96657
2021-03-02 22:33:37 +00:00
Matt Arsenault fd82cbcf7d GlobalISel: Merge and cleanup more AMDGPU call lowering code
This merges more AMDGPU ABI lowering code into the generic call
lowering. Start cleaning up by factoring away more of the pack/unpack
logic into the buildCopy{To|From}Parts functions. These could use more
improvement, and the SelectionDAG versions are significantly more
complex, and we'll eventually have to emulate all of those cases too.

This is mostly NFC, but does result in some minor instruction
reordering. It also removes some of the limitations with mismatched
sizes the old code had. However, similarly to the merge on the input,
this is forcing gfx6/gfx7 to use the gfx8+ ABI (which is what we
actually want, but SelectionDAG is stuck using the weird emergent
ABI).

This also changes the load/store size for stack passed EVTs for
AArch64, which makes it consistent with the DAG behavior.
2021-03-02 17:31:13 -05:00
Adrian Prantl 14ccba26bd Promote scalars to load addresses when dereferencing them.
This is a follow-up to 188b0747c1. This
is a very narrow fix to a more general problem. LLDB should be better
at distinguishing between implicit and memory location descriptions.

rdar://74902042
2021-03-02 14:30:39 -08:00
Nikita Popov 29034f3876 [AST] Remove unused Loop member (NFC)
To fix some build bots after D89264.
2021-03-02 23:11:51 +01:00
Sam McCall bca3e24139 [clangd] Move DraftStore from ClangdLSPServer into ClangdServer.
ClangdServer already gets notified of every change, so it makes sense for it to
be the source of truth.
This is a step towards having ClangdServer expose a FS that includes dirty
buffers: D94554

Related changes:
 - version is now optional for ClangdServer, to preserve our existing fuzziness
   in this area (missing version ==> autoincrement)
 - ClangdServer::format{File,Range} are now more regular ClangdServer functions
   that don't need the code passed in. While here, combine into one function.
 - incremental content update logic is moved from DraftStore to
   ClangdLSPServer, with most of the implementation in SourceCode.cpp.
   DraftStore is now fairly trivial, and will probably ultimately be
   *replaced* by the dirty FS stuff.

Differential Revision: https://reviews.llvm.org/D97738
2021-03-02 22:58:50 +01:00
Nathan James 00c7d6699a [cte][NFC] Remove all references to stdlib stream headers.
Inclusion of iostream is forbidden and using other stream classes from the standard library is discouraged as per https://llvm.org/docs/CodingStandards.html#include-iostream-is-forbidden

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D97771
2021-03-02 21:57:16 +00:00
Markus Böck a7cad6680b [PR48898][CMake] Support MinGW Toolchain tools in llvm_ExternalProject_Add
Windows is in the unique position of having two drivers, clang-cl and normal GNU clang, depending on whether a GNU or MSVC target is used. The current implementation with the USE_TOOLCHAIN argument assumes that when CMAKE_SYSTEM_NAME is set to Windows that clang-cl should be used, which is the incorrect choice when targeting a GNU environment.

This patch solves this problem by adding an optional TARGET_TRIPLE argument to llvm_ExternalProject_Add, which sets the various CMAKE_<LANG>_COMPILER_TARGET variables. Additionally, if the triple is detected as an MSVC environment, clang-cl and similar MSVC specific tools will be used instead of the GNU tools.
2021-03-02 22:45:05 +01:00
Heejin Ahn 4a58116b7e [WebAssembly] Fix more ExceptionInfo grouping bugs
This fixes two bugs in `WebAssemblyExceptionInfo` grouping, created by
D97247. These two bugs are not easy to split into two different CLs,
because tests that fail for one also tend to fail for the other.

- In D97247, when fixing `ExceptionInfo` grouping by taking out
  the unwind destination's exception from the unwind src's exception, we
  just iterated the BBs in the function order, but this was incorrect;
  this changes it to dominator tree preorder. Please refer to the
  comments in the code for the reason and an example.

- After this subexception-taking-out fix, there still can be remaining
  BBs we have to take out. When Exception B is taken out of Exception A
  (because EHPad B is the unwind destination of EHPad A), there can
  still be BBs within Exception A that are reachable from Exception B,
  which also should be taken out. Please refer to the comments in the
  code for more detailed explanation on why this can happen. To make
  this possible, this splits `WebAssemblyException::addBlock` into two
  parts: adding to a set and adding to a vector. We need to iterate on
  BBs within a `WebAssemblyException` to fix this, so we add BBs to sets
  first. But we add BBs to vectors later after we fix all incorrectness
  because deleting BBs from vectors is expensive. I considered removing
  the vector from `WebAssemblyException`, but it was not easy because
  this class has to maintain a similar interface with `MachineLoop` to
  be wrapped into a single interface `SortRegion`, which is used in
  CFGSort.

Other misc. drive-by fixes:
- Make `WebAssemblyExceptionInfo` not even run when wasm EH is not
  used or the function doesn't have any EH pads, so as not to waste time
- Add `LLVM_DEBUG` lines for easy debugging
- Fix `preds` comments in cfg-stackify-eh.ll
- Fix `__cxa_throw`'s signature in cfg-stackify-eh.ll

Fixes https://github.com/emscripten-core/emscripten/issues/13554.

Reviewed By: dschuff, tlively

Differential Revision: https://reviews.llvm.org/D97677
2021-03-02 13:44:09 -08:00
Stephen Kelly 7b6fc9a105 [clang-tidy] Simplify unused RAII check
Fix handling of default construction where the constructor has a default arg.

Differential Revision: https://reviews.llvm.org/D97142
2021-03-02 21:33:34 +00:00
Nikita Popov 3d8f842712 [LICM] Make promotion faster
Even when MemorySSA-based LICM is used, an AST is still populated
for scalar promotion. As the AST has quadratic complexity, a lot
of time is spent in this step despite the existing access count
limit. This patch optimizes the identification of promotable stores.

The idea here is pretty simple: We're only interested in must-alias
mod sets of loop invariant pointers. As such, only populate the AST
with loop-invariant loads and stores (anything else is definitely
not promotable) and then discard any sets which alias with any of
the remaining, definitely non-promotable accesses.

If we promoted something, check whether this has made some other
accesses loop invariant and thus possible promotion candidates.

This is much faster in practice, because we need to perform AA
queries for O(NumPromotable^2 + NumPromotable*NumNonPromotable)
instead of O(NumTotal^2), and NumPromotable tends to be small.
Additionally, promotable accesses have loop invariant pointers,
for which AA is cheaper.
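The two bounds can be compared with a small worked example. This is only arithmetic on the formulas quoted above; the function names and the sample counts are hypothetical and are not taken from the patch.

```cpp
#include <cstdint>

// Bound when every access may be compared with every other access:
// O(NumTotal^2).
constexpr std::uint64_t queriesAllPairs(std::uint64_t numTotal) {
  return numTotal * numTotal;
}

// Bound when only promotable accesses are compared with each other and
// with the non-promotable ones:
// O(NumPromotable^2 + NumPromotable * NumNonPromotable).
constexpr std::uint64_t queriesPromotableOnly(std::uint64_t numPromotable,
                                              std::uint64_t numNonPromotable) {
  return numPromotable * numPromotable + numPromotable * numNonPromotable;
}
```

For a hypothetical loop with 100 accesses of which 5 are promotable, the first bound is 10000 and the second is 500.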

This has a significant positive compile-time impact. We save ~1.8%
geomean on CTMark at O3, with 6% on lencod in particular and 25%
on individual files.

Conceptually, this change is NFC, but may not be so in practice,
because the AST is only an approximation, and can produce
different results depending on the order in which accesses are
added. However, there is at least no impact on the number of promotions
(licm.NumPromoted) in test-suite O3 configuration with this change.

Differential Revision: https://reviews.llvm.org/D89264
2021-03-02 22:10:48 +01:00
Yonghong Song 51cdb780db BPF: Fix a bug in peephole TRUNC elimination optimization
Andrei Matei reported a llvm11 core dump for his bpf program
   https://bugs.llvm.org/show_bug.cgi?id=48578
The core dump happens in LiveVariables analysis phase.
  #4 0x00007fce54356bb0 __restore_rt
  #5 0x00007fce4d51785e llvm::LiveVariables::HandleVirtRegUse(unsigned int,
      llvm::MachineBasicBlock*, llvm::MachineInstr&)
  #6 0x00007fce4d519abe llvm::LiveVariables::runOnInstr(llvm::MachineInstr&,
      llvm::SmallVectorImpl<unsigned int>&)
  #7 0x00007fce4d519ec6 llvm::LiveVariables::runOnBlock(llvm::MachineBasicBlock*, unsigned int)
  #8 0x00007fce4d51a4bf llvm::LiveVariables::runOnMachineFunction(llvm::MachineFunction&)
The bug can be reproduced with llvm12 and latest trunk as well.

Further analysis shows that there is a bug in BPF peephole
TRUNC elimination optimization, which tries to remove
unnecessary TRUNC operations (a <<= 32; a >>= 32).
Specifically, the compiler performed a wrong transformation for the
following pattern:
   %1 = LDW ...
   %2 = SLL_ri %1, 32
   %3 = SRL_ri %2, 32
   ... %3 ...
   %4 = SRA_ri %2, 32
   ... %4 ...

The current transformation did not check how many uses %2 has
and transformed the code into
   %1 = LDW ...
   ... %1 ...
   %4 = SRL_ri %2, 32
   ... %4 ...
and pseudo register %2 is used but not defined, which
caused the LiveVariables analysis core dump.

To fix the issue, when traversing back from SRL_ri to SLL_ri,
check to ensure SLL_ri has only one use. Otherwise, don't
do the transformation.
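To see why the shared shift-left result matters, here is a minimal sketch in plain C++ of the two shift pairs from the pattern. The function names are hypothetical; the sketch models only the arithmetic, not the BPF machine instructions or the peephole pass itself.

```cpp
#include <cstdint>

// SLL followed by SRL (logical right shift): clears the upper 32 bits.
uint64_t sllThenSrl(uint64_t v) { return (v << 32) >> 32; }

// SLL followed by SRA (arithmetic right shift): copies bit 31 into the
// upper 32 bits. Right-shifting a negative signed value is arithmetic on
// mainstream compilers and is guaranteed to be so from C++20.
int64_t sllThenSra(uint64_t v) {
  return static_cast<int64_t>(v << 32) >> 32;
}
```

For an input such as 0xFFFFFFFF the two results differ, so a second user that applies SRA to the shifted value still needs the SLL to exist even when the SLL/SRL pair is redundant.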

Differential Revision: https://reviews.llvm.org/D97792
2021-03-02 13:03:42 -08:00