Commit Graph

390812 Commits

Author SHA1 Message Date
Valeriy Savchenko b6bcf95322 [analyzer] Change FindLastStoreBRVisitor to use Tracker
Additionally, this commit completely removes any uses of
FindLastStoreBRVisitor from the analyzer except for the
one in Tracker.

The next step is actually removing this class altogether
from the header file.

Differential Revision: https://reviews.llvm.org/D103618
2021-06-11 12:49:03 +03:00
Valeriy Savchenko 967c06b3e9 [analyzer] Reimplement trackExpressionValue as ExpressionHandler
This commit moves trackExpressionValue into the Tracker interface
as DefaultExpressionHandler.  It still can be split into smaller
handlers, but that can be a future change.

Additionally, this commit doesn't remove the original trackExpressionValue
interface, so it's not too big.  One of the next commits will address it.

Differential Revision: https://reviews.llvm.org/D103616
2021-06-11 12:49:03 +03:00
Valeriy Savchenko 0cc3100bf8 [analyzer] Introduce a new interface for tracking
Tracking values through expressions and the stores is fundamental
for producing clear diagnostics.  However, the main components
participating in this process, namely `trackExpressionValue` and
`FindLastStoreBRVisitor`, became pretty bloated.  They have an
interesting dynamic between them (and some other visitors) that
one might call a "chain reaction". `trackExpressionValue` adds
`FindLastStoreBRVisitor`, and the latter calls `trackExpressionValue`.

Because of this design, individual checkers couldn't affect what's
going to happen somewhere in the middle of that chain.  Whether they
want to produce a more informative note or keep the overall tracking
going by utilizing some of the domain expertise.  This all lead to two
biggest problems that I see:

  * Some checkers don't use it
  This should probably never be the case for path-sensitive checks.

  * Some checkers incorporated their logic directly into those
    components
  This doesn't make the maintenance easier, breaks multiple
  architecture principles, and makes the code harder to read adn
  understand, thus, increasing the probability of the first case.

This commit introduces a prototype for a new interface that will be
responsible for tracking.  My main idea here was to make operations
that I want have as a checker developer easy to implement and hook
directly into the tracking process.

Differential Revision: https://reviews.llvm.org/D103605
2021-06-11 12:49:03 +03:00
Roman Lebedev 20542b47d6
[VectorCombine] scalarizeLoadExtract(): use computeAlignmentAfterScalarization() helper
This results in slightly more optimistic alignments in some cases
2021-06-11 12:47:10 +03:00
Roman Lebedev abc0e0125c
[NFC][VectorCombine] Extract computeAlignmentAfterScalarization() helper function 2021-06-11 12:47:09 +03:00
Bing1 Yu 56d5c46b49 [X86] Support __tile_stream_loadd intrinsic for new AMX interface
Adding support for __tile_stream_loadd intrinsic.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D103784
2021-06-11 17:28:43 +08:00
Simon Pilgrim f0a68bbc96 SampleProf.h - fix spelling mistake in assert message. NFC. 2021-06-11 10:24:14 +01:00
Simon Pilgrim 5e6bfb661e [Analysis] Pass RecurrenceDescriptor as const reference. NFCI.
We were passing the RecurrenceDescriptor by value to most of the reduction analysis methods, despite it being rather bulky with TrackingVH members (that can be costly to copy). In all these cases we're only using the RecurrenceDescriptor for rather basic purposes (access to types/kinds etc.).

Differential Revision: https://reviews.llvm.org/D104029
2021-06-11 10:24:14 +01:00
Simon Pilgrim d789ed11ea Fix implicit dependency on <string> header. NFCI. 2021-06-11 10:24:14 +01:00
Sven van Haastregt ca964b40e6 [OpenCL][NFC] Reorganize ClangOpenCLBuiltinEmitter comments
Since 8866793b4e ("[OpenCL] Add OpenCL builtin test generator",
2021-06-09) there are two emitters in this file, so move the
file-level comment to the appropriate class.
2021-06-11 10:22:59 +01:00
Stephen Hines 6455418d3d [compiler-rt] [builtins] [AArch64] Add missing AArch64 data synchronization barrier (dsb) to __clear_cache
https://developer.arm.com/documentation/den0024/a/Caches/Cache-maintenance
covers how to properly clear caches on AArch64, and the builtin
implementation was missing a `dsb ish` after clearing the icache for the
selected range.

Reviewed By: kristof.beyls

Differential Revision: https://reviews.llvm.org/D104094
2021-06-11 02:13:48 -07:00
Ivan Murashko 47d138c939 [clang-tidy] LIT test fix for Remark diagnostic
There is a followup fix for a unit test introduced at D102906. The test file was placed into a temp folder and test assumed that it would be visible without the full path specification.

This behaviour can be changed in future and it would be good to specify full path to the file at the test.

Test Plan:
```
ninja check-clang-tools
```

Reviewed By: DmitryPolukhin

Differential Revision: https://reviews.llvm.org/D104021
2021-06-11 02:02:36 -07:00
Adrian Kuegel f98b779614 [mlir] Refactor ComplexOps.td [NFC]
Create a ComplexUnaryOp base class and use it for AbsOp, ReOp and ImOp.
Sort all ops in lexicographic order.

Differential Revision: https://reviews.llvm.org/D104095
2021-06-11 10:53:29 +02:00
LLVM GN Syncbot eac994e227 [gn build] Port c4a0969b9c 2021-06-11 08:23:07 +00:00
Sjoerd Meijer c4a0969b9c Function Specialization Pass
This adds a function specialization pass to LLVM. Constant parameters
like function pointers and constant globals are propagated to the callee by
specializing the function.

This is a first version with a number of limitations:
- The pass is off by default, so needs to be enabled on the command line,
- It does not handle specialization of recursive functions,
- It does not yet handle constants and constant ranges,
- Only 1 argument per function is specialised,
- The cost-model could be further looked into, and perhaps related,
- We are not yet caching analysis results.

This is based on earlier work by Matthew Simpson (D36432) and Vinay Madhusudan.
More recently this was also discussed on the list, see:

https://lists.llvm.org/pipermail/llvm-dev/2021-March/149380.html.

The motivation for this work is that function specialisation often comes up as
a reason for performance differences of generated code between LLVM and GCC,
which has this enabled by default from optimisation level -O3 and up. And while
this certainly helps a few cpu benchmark cases, this also triggers in real
world codes and is thus a generally useful transformation to have in LLVM.

Function specialisation has great potential to increase compile-times and
code-size.  The summary from some investigations with this patch is:
- Compile-time increases for short compile jobs is high relatively, but the
  increase in absolute numbers still low.
- For longer compile-jobs, the extra compile time is around 1%, and very much
  in line with GCC.
- It is difficult to blame one thing for compile-time increases: it looks like
  everywhere a little bit more time is spent processing more functions and
  instructions.
- But the function specialisation pass itself is not very expensive; it doesn't
  show up very high in the profile of the optimisation passes.

The goal of this work is to reach parity with GCC which means that eventually
we would like to get this enabled by default. But first we would like to address
some of the limitations before that.

Differential Revision: https://reviews.llvm.org/D93838
2021-06-11 09:11:29 +01:00
Petr Hosek 22f194909a Revert "[Driver] Support libc++ in MSVC"
This reverts commit 9625d61eb6 since
libc++ currently has issues with disabled exceptions which breaks
the runtimes build.
2021-06-11 00:45:56 -07:00
Petr Hosek 0d5af7a4ca Revert "[CMake] Don't use libc++ by default on Windows yet"
This reverts commit b413e44200.
2021-06-11 00:45:49 -07:00
Vitaly Buka f3f904563e [lldb] Fix leak in test
Test leaks if we run
tools/lldb/unittests/Host/HostTests without --gtest_filter

Reviewed By: teemperor

Differential Revision: https://reviews.llvm.org/D104091
2021-06-11 00:20:35 -07:00
Qiu Chaofan bc104fdcec [PowerPC] Relax register superclasses for paired memops
Relaxing superclass constraint for VSX register classes helps reducing
32-byte spills and copies when register pressure is high.

In test case affected, some of them introduces more copies due to new
allocation order. However, this patch should not be the root cause, and
we may be able to fix it in other places of register allocation.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D104006
2021-06-11 14:54:03 +08:00
Raphael Isemann 632cbcac79 [lldb] Move once_flags in HostInfoLinux so the internal state struct
The HostInfoLinuxFields struct is supposed to be set up/torn down on
Initialize/Terminate and should contain all the state of the plugin.
`once_flags` are part of this state and should also be reset on `Terminate` so
we can re-initialize these lazy values after the next `Initialize` call.

This itself is NFC as the HostInfoLinux was broken before this patch and is
still broken afterwards. D104091 will be the proper fix.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D104093
2021-06-11 08:53:38 +02:00
Hsiangkai Wang 643b6407fa [RISCV] Avoid scalar outgoing argumetns overwriting vector frame objects.
When using FP to access stack objects, the scalable stack objects will
be put at the lower end of the frame. It looks like

```
|-------------------|  <-- FP
| callee-saved regs |
|-------------------|
| scalar local vars |
|-------------------|
| RVV local vars    |
|-------------------|  <-- SP
```

If there are scalar arguments that need to pass through memory and there
are vector objects on the stack using FP to access. The outgoing scalar
arguments will overwrite the vector objects. It looks like

```
|-------------------|  <-- FP
| callee-saved regs |
|-------------------|
| scalar local vars |
|-------------------|         |-------------------|
| RVV local vars    |         | outgoing args     | <- outgoing arguments
|-------------------|  <-- SP |-------------------|    overwrite from here.
```

In this patch, we reserve the stack for the outgoing arguments before
function calls if using FP to access and there are scalable vector frame
objects. It looks like

```
|-------------------|  <-- FP
| callee-saved regs |
|-------------------|
| scalar local vars |
|-------------------|
| RVV local vars    |
|-------------------|
| outgoing args     |
|-------------------|  <-- SP
```

Differential Revision: https://reviews.llvm.org/D103622
2021-06-11 12:26:29 +08:00
Nico Weber 54418c5a35 [lld/mac] Make binaries written by lld strippable
Be less clever when writing the indirect symbols in LC_DYSYMTAB:
lld used to make point __stubs and __la_symbol_ptr point at the
same bytes in the indirect symbol table in the __LINKEDIT segment.
That confused strip, so write the same bytes twice and make
__stubs and __la_symbol_ptr point at one copy each, so that they
don't share data. This unconfuses strip, and seems to be what ld64
does too, so hopefully tools are generally more used to this.

This makes the output binaries a bit larger, but not much: 4 bytes
for roughly each called function from a dylib and each weak function.
Chromium Framewoork grows by 6536 bytes, clang-format by a few hundred.

With this, `strip -x Chromium\ Framework` works (244 MB before stripping
to 171 MB after stripping, compared to 236 MB=>164 MB with ld64). Running
strip without `-x` produces the same error message now for lld-linked
Chromium Framework as for when using ld64 as a linker.

`strip clang-format` also works now but didn't previously.

Fixes PR50657.

Differential Revision: https://reviews.llvm.org/D104081
2021-06-11 00:18:03 -04:00
Craig Topper 081ae5fe1a [RISCV] Remove extra assignment of intrinsic ID in ManualCodegen. NFC
There's already an autogenerated assignment.

Fixes static analyzer warning reported in PR50593.
2021-06-10 20:46:34 -07:00
Arthur Eubanks 85ca7e424f Revert "[clang] NRVO: Improvements and handling of more cases."
This reverts commit 667fbcdd0b.

Causes crashes on a stage 2 build on Windows.
2021-06-10 20:37:01 -07:00
Arthur Eubanks db26615aa6 Revert "[clang] Implement P2266 Simpler implicit move"
This reverts commit cbd0054b9e.
2021-06-10 19:54:50 -07:00
Qiu Chaofan 2670c7dd5b [VectorCombine] Fix alignment in single element store
This fixes the concern in single element store scalarization that the
alignment of new store may be larger than it should be. It calculates
the largest alignment if index is constant, and a safe one if not.

Reviewed By: lebedev.ri, spatel

Differential Revision: https://reviews.llvm.org/D103419
2021-06-11 10:28:15 +08:00
Craig Topper 420bd5ee8e [RISCV] Use ComputeNumSignBits/MaskedValueIsZero in RISCVDAGToDAGISel::selectSExti32/selectZExti32.
This helps us select W instructions in more cases. Most of the
affected tests have had the sign_extend_inreg or AND folded into
sextload/zextload.

Differential Revision: https://reviews.llvm.org/D104079
2021-06-10 19:06:45 -07:00
Michael Kruse 7836d058c7 [Flang] Compile fix after D99459.
Fix Flang build after addition of a new OpenMP clauses for a Clang
patch (D99459). Flang is using TableGen to generation the declaration
of clause checks and the new clause was missing a definiton.
2021-06-10 21:01:09 -05:00
River Riddle 8800047707 [mlir-ir-printing] Prefix the dump message with the split marker(// -----)
This allows for better interaction with tools (such as mlir-lsp-server), as it separates the IR into separate modules for consecutive dumps.

Differential Revision: https://reviews.llvm.org/D104073
2021-06-10 17:34:50 -07:00
River Riddle c42dd5dbb0 [mlir] Add new SubElementAttr/SubElementType Interfaces
These interfaces allow for a composite attribute or type to opaquely provide access to any held attributes or types. There are several intended use cases for this interface. The first of which is to allow the printer to create aliases for non-builtin dialect attributes and types. In the future, this interface will also be extended to allow for SymbolRefAttr to be placed on other entities aside from just DictionaryAttr and ArrayAttr.

To limit potential test breakages, this revision only adds the new interfaces to the builtin attributes/types that are currently hardcoded during AsmPrinter alias generation. In a followup the remaining builtin attributes/types, and non-builtin attributes/types can be extended to support it.

Differential Revision: https://reviews.llvm.org/D102945
2021-06-10 17:23:07 -07:00
River Riddle f8a1d652da [mlir][IR] Move MemRefElementTypeInterface to a new BuiltinTypeInterfaces file
This allows for using other type interfaces in the builtin dialect, which currently results in a compile time failure (as it generates duplicate interface declarations).
2021-06-10 17:23:06 -07:00
Amara Emerson 670edf3ee0 [AArch64][GlobalISel] Fix incorrectly generating uxtw/sxtw for addressing modes.
When the extend is from 8 or 16 bits, the addressing modes don't support those
extensions, but we weren't checking that and therefore always generated the 32->64b
extension mode. Fun.

Differential Revision: https://reviews.llvm.org/D104070
2021-06-10 16:59:39 -07:00
Carl Ritson 2c2d2922a2 [ValueTypes] Define MVTs for v6i32, v6f32, v7i32, v7f32
For use in AMDGPU selection DAG.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D103881
2021-06-11 08:58:16 +09:00
Carl Ritson cfbb92441f [SDAG] Fix pow2 assumption when splitting vectors
When reducing vector builds to shuffles it possible that
the DAG combiner may try to extract invalid subvectors.

This happens as the existing code assumes vectors will be power
of 2 sizes, which is already untrue, but becomes more noticable
with v6 and v7 types.
Specifically the existing code assumes that half PowerOf2Ceil of
a given vector index will fit twice into a given vector.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D103880
2021-06-11 08:58:16 +09:00
Craig Topper b35a842581 [RISCV] Add test cases that show failure to use some W instructions if they are proceeded by a load. NFC
The loads end up becoming sextload/zextload which prevent our
isel patterns from finding the sign_extend_inreg or AND instruction
we need.

The easiest way to fix this is to use computeKnownBits or
ComputeNumSignBits in our isel matching to catch this.
2021-06-10 16:55:49 -07:00
Sami Tolvanen ffaca140d0 [IR] Value: Fix OpCode checks
Value::SubclassID cannot be directly compared to Instruction enums, such as
Instruction::{Call,Invoke,CallBr}. We have to first subtract InstructionVal
from the SubclassID to get the OpCode, similar to Instruction::getOpCode().

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D104043
2021-06-10 16:46:33 -07:00
Wolfgang Pieb 5a1589fc6d [static initializers] Emit global_ctors and global_dtors in reverse order when .ctors/.dtors are used.
Reviewed By: rnk, MaskRay, efriedma

Differential Revision: https://reviews.llvm.org/D103495
2021-06-10 16:44:47 -07:00
Slava Nikolaev 119965865c LoadStoreVectorizer: support different operand orders in the add sequence match
First we refactor the code which does no wrapping add sequences
match: we need to allow different operand orders for
the key add instructions involved in the match.

Then we use the refactored code trying 4 variants of matching operands.

Originally the code relied on the fact that the matching operands
of the two last add instructions of memory index calculations
had the same LHS argument. But which operand is the same
in the two instructions is actually not essential, so now we allow
that to be any of LHS or RHS of each of the two instructions.
This increases the chances of vectorization to happen.

Reviewed By: volkan

Differential Revision: https://reviews.llvm.org/D103912
2021-06-10 16:31:35 -07:00
Arthur Eubanks b73742bc8d [Profile] Remove redundant check
This is already checked outside the loop.

Followup to D104050.
2021-06-10 16:24:53 -07:00
Nick Desaulniers fc018ebb60 [IR] make -warn-frame-size into a module attr
-Wframe-larger-than= is an interesting warning; we can't know the frame
size until PrologueEpilogueInsertion (PEI); very late in the compilation
pipeline.

-Wframe-larger-than= was propagated through CC1 as an -mllvm flag, then
was a cl::opt in LLVM's PEI pass; this meant it was dropped during LTO
and needed to be re-specified via -plugin-opt.

Instead, make it part of the IR proper as a module level attribute,
similar to D103048. Introduce -fwarn-stack-size CC1 option.

Reviewed By: rsmith, qcolombet

Differential Revision: https://reviews.llvm.org/D103928
2021-06-10 16:15:27 -07:00
Arthur Eubanks 189428c8fc [Profile] Handle invalid profile data
This mostly follows LLVM's InstrProfReader.cpp error handling.
Previously, attempting to merge corrupted profile data would result in
crashes. See https://crbug.com/1216811#c4.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D104050
2021-06-10 16:10:13 -07:00
Matheus Izvekov cbd0054b9e [clang] Implement P2266 Simpler implicit move
This Implements [[http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2266r1.html|P2266 Simpler implicit move]].

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>

Reviewed By: Quuxplusone

Differential Revision: https://reviews.llvm.org/D99005
2021-06-11 00:56:06 +02:00
Andy Kaylor 41555eaf65 Preserve more MD_mem_parallel_loop_access and MD_access_group in SROA
SROA sometimes preserves MD_mem_parallel_loop_access and MD_access_group metadata on loads/stores, and sometimes fails to do so. This change adds copying of the MD after other CreateAlignedLoad/CreateAlignedStores. Also fix a case where the metadata was being copied from a load, rather than the store.

Added a LIT test to catch one case.

Patch by Mark Mendell

Differential Revision: https://reviews.llvm.org/D103254
2021-06-10 15:47:03 -07:00
Christopher Di Bella 462f8f0611 [libcxx][ranges] removes default_initializable from weakly_incrementable and view
also:

* removes default constructors from predefined iterators
* makes span and string_view views

Partially implements P2325.
Partially resolves LWG3326.

Differential Revision: https://reviews.llvm.org/D102468
2021-06-10 22:45:36 +00:00
Jessica Paquette 933df6ca79 [AArch64][GlobalISel] Legalize scalar G_CTTZ + G_CTTZ_ZERO_UNDEF
This adds legalization for scalar G_CTTZ and G_CTTZ_ZERO_UNDEF. Vector support
requires handling vector G_BITREVERSE, which I haven't gotten around to yet.

For G_CTTZ_ZERO_UNDEF, we just lower it to G_CTTZ.

For G_CTTZ, we match SelectionDAG's lowering to a G_BITREVERSE + G_CTLZ.

e.g. https://godbolt.org/z/nPEseYh1s

(With this patch, we have slightly worse codegen than SDAG for types smaller
than s32; it seems like we're missing a combine.)

Also, this adds in a function to build G_BITREVERSE to MachineIRBuilder.

Differential Revision: https://reviews.llvm.org/D104065
2021-06-10 15:29:51 -07:00
Geoffrey Martin-Noble 4f6ec382c8 [MLIR] Document that Dialect Conversion traverses in preorder
Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D102525
2021-06-10 15:08:56 -07:00
Benoit Jacob 20daedacca 2d Arm Neon sdot op, and lowering to the intrinsic.
This adds Sdot2d op, which is similar to the usual Neon
intrinsic except that it takes 2d vector operands, reflecting the
structure of the arithmetic that it's performing: 4 separate
4-dimensional dot products, whence the vector<4x4xi8> shape.

This also adds a new pass, arm-neon-2d-to-intr, lowering
this new 2d op to the 1d intrinsic.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D102504
2021-06-10 14:36:39 -07:00
Joachim Meyer 4f01122c3f [LV] Parallel annotated loop does not imply all loads can be hoisted.
As noted in https://bugs.llvm.org/show_bug.cgi?id=46666, the current behavior of assuming if-conversion safety if a loop is annotated parallel (`!llvm.loop.parallel_accesses`), is not expectable, the documentation for this behavior was since removed from the LangRef again, and can lead to invalid reads.
This was observed in POCL (https://github.com/pocl/pocl/issues/757) and would require similar workarounds in current work at hipSYCL.

The question remains why this was initially added and what the implications of removing this optimization would be.
Do we need an alternative mechanism to propagate the information about legality of if-conversion?
Or is the idea that conditional loads in `#pragma clang loop vectorize(assume_safety)` can be executed unmasked without additional checks flawed in general?
I think this implication is not part of what a user of that pragma (and corresponding metadata) would expect and thus dangerous.

Only two additional tests failed, which are adapted in this patch. Depending on the further direction force-ifcvt.ll should be removed or further adapted.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D103907
2021-06-10 23:37:57 +02:00
Sanjay Patel 7b969ef8b4 [SimplifyCFG] avoid 'tmp' variables in test file; NFC 2021-06-10 17:04:23 -04:00
Matheus Izvekov 667fbcdd0b [clang] NRVO: Improvements and handling of more cases.
This expands NRVO propagation for more cases:

Parse analysis improvement:
* Lambdas and Blocks with dependent return type can have their variables
  marked as NRVO Candidates.

Variable instantiation improvements:
* Fixes crash when instantiating NRVO variables in Blocks.
* Functions, Lambdas, and Blocks which have auto return type have their
  variables' NRVO status propagated. For Blocks with non-auto return type,
  as a limitation, this propagation does not consider the actual return
  type.

This also implements exclusion of VarDecls which are references to
dependent types.

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>

Reviewed By: Quuxplusone

Differential Revision: https://reviews.llvm.org/D99696
2021-06-10 23:02:51 +02:00