Commit Graph

368566 Commits

Author SHA1 Message Date
Sean Silva a2b6c75ac0 [mlir] Rename BufferPlacement.h to Bufferize.h
Context: https://llvm.discourse.group/t/what-is-the-strategy-for-tensor-memref-conversion-bufferization/1938/14

Differential Revision: https://reviews.llvm.org/D89174
2020-10-09 17:48:20 -07:00
Walter Erquinigo ea1f49741e [intel pt] Refactor parsing
With the feedback I was getting in different diffs, I realized that splitting the parsing logic into two classes was not easy to deal with. I do see value in doing that, but I'd rather leave that as a refactor after most of the intel-pt logic is in place. Thus, I'm merging the common parser into the intel pt one, having thus only one that is fully aware of Intel PT during parsing and object creation.

Besides, based on the feedback in https://reviews.llvm.org/D88769, I'm creating a ThreadIntelPT class that will be able to orchestrate decoding of its own trace and can handle the stop events correctly.

This leaves the TraceIntelPT class as an initialization class that glues together different components. Right now it can initialize a trace session from a json file, and in the future will be able to initialize a trace session from a live process.

Besides, I'm renaming SettingsParser to SessionParser, which I think is a better name, as the json object represents a trace session of possibly many processes.

With the current set of targets, we have the following

- Trace: main interface for dealing with trace sessions
- TraceIntelPT: plugin Trace for dealing with intel pt sessions
- TraceIntelPTSessionParser: a parser of a json trace session file that can create a corresponding TraceIntelPT instance along with Targets, ProcessTraces (to be created in https://reviews.llvm.org/D88769), and ThreadIntelPT threads.
- ProcessTrace: (to be created in https://reviews.llvm.org/D88769) can handle the correct state of the traces as the user traverses the trace. I don't think there'll be a need an intel-pt specific implementation of this class.
- ThreadIntelPT: a thread implementation that can handle the decoding of its own trace file, along with keeping track of the current position the user is looking at when doing reverse debugging.

Differential Revision: https://reviews.llvm.org/D88841
2020-10-09 17:32:04 -07:00
Aart Bik 3c366740ca [mlir] [standard] fixed typo in comment
There is an atomic_rmw and a generic_atomic_rmw operation.
The doc of the latter incorrectly referred to former though.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D89172
2020-10-09 17:04:21 -07:00
Fangrui Song f6fa4d07dc [bugpoint] Delete -safe-llc and make -run-llc work like -run-llc -safe-run-llc 2020-10-09 16:38:30 -07:00
Stella Stamenova 09dbdcf15f [mlir, win] Mark several MLRI tests as unsupported on system-windows
They are currently marked as unsupported when windows is part of the triple, but they actually fail when they are run on Windows, so they are unsupported on system-windows

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D89169
2020-10-09 16:27:50 -07:00
Changpeng Fang f192a27ed3 Sink: Handle instruction sink when a user is dead
Summary:
  The current instruction sink pass uses findNearestCommonDominator of all users to find block to sink the instruction to.
However, a user may be in a dead block, which will result in unexpected behavior.

This patch handles such cases by skipping dead blocks. This patch fixes:
https://bugs.llvm.org/show_bug.cgi?id=47415

Reviewers:
  MaskRay, arsenm

Differential Revision:
  https://reviews.llvm.org/D89166
2020-10-09 16:20:26 -07:00
Joao Moreira e0b89df2e0 [X86] Check if call is indirect before emitting NT_CALL
The notrack prefix is a relaxation of CET policies which makes it possible to indirectly call targets which do not have an ENDBR instruction in the landing address. To emit a call with this prefix, the special attribute "nocf_check" is used. When used as a function attribute, a CallInst targeting the respective function will return true for the method "doesNoCfCheck()", no matter if it is a direct call (and such should remain like this, as the information that the to-be-called function won't perform control-flow checks is useful in other contexts). Yet, when emitting an X86ISD::NT_CALL, the respective CallInst should be verified for its indirection, allowing that the prefixed calls are only emitted in the right situations.

Update the respective testing unit to also verify for direct calls to functions with ''nocf_check'' attributes.

The bug can also be reproduced through compiling the following C code using the -fcf-protection=full flag.

int __attribute__((nocf_check)) foo(int a) {};

int main() {
  foo(42);
}

Differential Revision: https://reviews.llvm.org/D87320
2020-10-09 15:54:23 -07:00
Fangrui Song 488f1c4893 [X86][test] Add a regression test for lock cmpxchg16b on a global variable with offset
Add a test for a bug (uncovered by D88808) fixed by f34bb06935.
Also delete cmpxchg16b.ll which is covered by atomic128.ll

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D89163
2020-10-09 15:44:32 -07:00
Eli Friedman 278299b0f0 [SCCP] Reduce the number of times ResolvedUndefsIn is called for large modules.
If a module has many values that need to be resolved by
ResolvedUndefsIn, compilation takes quadratic time overall. Solve should
do a small amount of work, since not much is added to the worklists each
time markOverdefined is called. But ResolvedUndefsIn is linear over the
length of the function/module, so resolving one undef at a time is
quadratic in general.

To solve this, make ResolvedUndefsIn resolve every undef value at once,
instead of resolving them one at a time. This loses a little
optimization power, but can be a lot faster.

We still need a loop around ResolvedUndefsIn because markOverdefined
could change the set of blocks that are live. That should be uncommon,
hopefully. We could optimize it by tracking which blocks transition from
dead to live, instead of iterating over the whole module to find them.
But I'll leave that for later. (The whole function will become a lot
simpler once we start pruning branches on undef.)

The regression test changes seem minor. The specific cases in question
could probably be optimized with a bit more work, but they seem like
edge cases that don't really matter.

Fixes an "infinite" compile issue my team found on an internal workoad.

Differential Revision: https://reviews.llvm.org/D89080
2020-10-09 15:24:16 -07:00
Steven Wu 360f275cb7 [IRMover] Add missing open quote in the warning message
Fix the missing single quotation mark in the warning message for
target triple mismatch.
2020-10-09 15:17:16 -07:00
Jordan Rupprecht 9b5b305023 Temporarily revert "[ThinLTO] Re-order modules for optimal multi-threaded processing"
This reverts commit 6537004913. This is causing test failures internally, and while a few of the cases turned out to be bad user code (relying on a specific order of static initialization across translation units), some cases are less clear. Temporarily reverting for now, and Teresa is going to follow up with more details.
2020-10-09 14:36:20 -07:00
Thomas Lively d8f58bf53a [WebAssembly] Prototype i16x8.q15mulr_sat_s
This saturating, rounding, Q-format multiplication instruction is proposed in
https://github.com/WebAssembly/simd/pull/365.

Differential Revision: https://reviews.llvm.org/D88968
2020-10-09 21:17:53 +00:00
Louis Dionne 2dc9b26c00 [libc++] Remove code to prevent overwriting the system libc++ on Darwin
The system partition is read-only since Catalina.
2020-10-09 17:02:39 -04:00
Louis Dionne 4bd3d16c2d [libc++] Remove redundant if(LIBCXX_INSTALL_LIBRARY)
The individual LIBCXX_INSTALL_(SHARED|STATIC)_LIBRARY are already
dependent on whether LIBCXX_INSTALL_LIBRARY is ON or OFF.
2020-10-09 17:02:39 -04:00
Saleem Abdulrasool 5d74c43511 DirectoryWatcher: add an implementation for Windows
This implements the directory watcher on Windows.  It does the most
naive thing for simplicity.  ReadDirectoryChangesW is used to monitor
the changes.  However, in order to support interruption, we must use
overlapped IO, which allows us to use the blocking, synchronous
mechanism.  We create a thread to post the notification to the consumer
to allow the monitoring to continue.  The two threads communicate via a
locked queue.

Differential Revision: https://reviews.llvm.org/D88666
Reviewed By: Adrian McCarthy
2020-10-09 20:55:57 +00:00
Mircea Trofin c11c20fb00 [NFC][Regalloc] VirtRegAuxInfo::Hint does not need to be a field
It is only used in weightCalcHelper, and cleared upon its finishing its
job there.

The patch further cleans up style guide discrepancies, and simplifies
CopyHint by removing duplicate 'IsPhys' information (it's what the Reg
field would report).
2020-10-09 13:42:23 -07:00
Krzysztof Parzyszek 6fd994b4b7 [Hexagon] Remove ISD node VSPLATW, use VSPLAT instead
This is a step towards improving HVX codegen for splat.
2020-10-09 15:38:02 -05:00
Krzysztof Parzyszek 33bb3efbb3 [Hexagon] Generalize handling of SDNodes created during ISel
The selection of HVX shuffles can produce more nodes in the DAG,
which need special handling, or otherwise they would be left
unselected by the main selection code. Make the handling of such
nodes more general.
2020-10-09 15:38:02 -05:00
Christian Sigg 473b364a19 Add GPU async op interface and token type.
See https://llvm.discourse.group/t/rfc-new-dialect-for-modelling-asynchronous-execution-at-a-higher-level/1345

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D88954
2020-10-09 22:37:13 +02:00
Nicolas Vasilache c303d9b394 [mlir][Linalg] NFC - Cleanup explicitly instantiated paterns 2/n - Loops.cpp
This revision belongs to a series of patches that reduce reliance of Linalg transformations on templated rewrite and conversion patterns.
Instead, this uses a MatchAnyTag pattern for the vast majority of cases and dispatches internally.

Differential revision: https://reviews.llvm.org/D89133
2020-10-09 19:59:49 +00:00
Nicolas Vasilache e0dc3dba3b [mlir][Linalg] NFC - Cleanup explicitly instantiated paterns 1/n - LinalgToStandard.cpp
This revision belongs to a series of patches that reduce reliance of Linalg transformations on templated rewrite and conversion patterns.
Instead, this uses a MatchAnyTag pattern for the vast majority of cases and dispatches internally.

Differential Revision: https://reviews.llvm.org/D89133
2020-10-09 19:41:41 +00:00
Nicolas Vasilache df295fac6c Revert "Give attributes C++ namespaces."
This reverts commit 0a34492f36.

This change turned out to be very intrusive wrt some internal projects.
Reverting until this can be sorted out.
2020-10-09 19:41:41 +00:00
Arthur Eubanks e4e23c55c0 [Reg2Mem][NewPM] Pin test to legacy PM
This pass hasn't been touched in a long time and isn't used in tree.
2020-10-09 12:36:08 -07:00
Vy Nguyen a2291a58bf Enable LSAN for Android
Make use of the newly added thread-properties API (available since 31).

    Differential Revision: https://reviews.llvm.org/D85927
2020-10-09 15:23:47 -04:00
Mircea Trofin 62e2ac6461 [NFC][Regalloc] Fix coding style in CalcSpillWeights 2020-10-09 12:22:12 -07:00
Stella Laurenzo e207927950 NFC: Address post-commit doc/formatting comments on TypeID.h. 2020-10-09 12:16:45 -07:00
Stella Laurenzo 0e9b572949 [mlir] Fix TypeID for shared libraries built with -fvisibility=hidden.
* Isolates the visibility controlled parts of its implementation to a detail namespace.
* Applies a struct level visibility attribute which applies to the static local within the get() functions.
* The prior version was not emitting a symbol for the static local "instance" fields when the user TU was compiled with -fvisibility=hidden.

Differential Revision: https://reviews.llvm.org/D89153
2020-10-09 12:12:34 -07:00
Scott Linder 40cef5a00e [clang] Add a test for CGDebugInfo treatment of blocks
There doesn't seem to be a direct test of this, and I'm planning to make
future changes which will affect it.

I'm not particularly familiar with the blocks extension, so suggestions
for better tests are welcome.

Differential Revision: https://reviews.llvm.org/D88754
2020-10-09 19:03:21 +00:00
Craig Topper f34bb06935 [X86] When expanding LCMPXCHG16B_NO_RBX in EmitInstrWithCustomInserter, directly copy address operands instead of going through X86AddressMode.
I suspect getAddressFromInstr and addFullAddress are not handling
all addresses cases properly based on a report from MaskRay.

So just copy the operands directly. This should be more efficient
anyway.
2020-10-09 11:55:24 -07:00
Craig Topper 662024df33 [X86] Don't copy kill flag when expanding LCMPXCHG16B_SAVE_RBX
The expansion code creates a copy to RBX before the real LCMPXCHG16B.
It's possible this copy uses a register that is also used by the
real LCMPXCHG16B. If we set the kill flag on the use in the copy,
then we'll fail the machine verifier on the use on the LCMPXCHG16B.

Differential Revision: https://reviews.llvm.org/D89151
2020-10-09 11:55:24 -07:00
Nikita Popov 466c8296f2 [MemCpyOpt] Add test for incorrectly hoisted store (NFC) 2020-10-09 20:52:08 +02:00
Louis Dionne 877667287f [libc++] Fixup a missing occurrence of LIBCXX_ENABLE_DEBUG_MODE 2020-10-09 14:40:47 -04:00
Louis Dionne e0d66ccf06 [libc++] Rename LIBCXX_ENABLE_DEBUG_MODE to LIBCXX_ENABLE_DEBUG_MODE_SUPPORT
To make it clearer this is about whether the library supports the debug
mode at all, not whether the debug mode is enabled. Per comment by Nico
Weber on IRC.
2020-10-09 14:39:20 -04:00
Louis Dionne 4abb519619 [libc++] NFCI: Define small methods of basic_stringstream inline
It greatly increases readability because defining the methods out-of-line
involves a ton of boilerplate template declarations.
2020-10-09 14:33:49 -04:00
Arthur Eubanks 2218e6d0a8 [BPF] Make BPFAbstractMemberAccessPass required
Or else on optnone functions we get the following during instruction selection:
  fatal error: error in backend: Cannot select: intrinsic %llvm.preserve.struct.access.index

Currently the -O0 pipeline doesn't properly run passes registered via
TargetMachine::registerPassBuilderCallbacks(), so don't add that RUN
line yet. That will be fixed after this.

Reviewed By: yonghong-song

Differential Revision: https://reviews.llvm.org/D89083
2020-10-09 11:26:37 -07:00
Simon Pilgrim 191fbda5d2 [ARM][MIPS] Add funnel shift test coverage
Based on offline discussions regarding D89139 and D88783 - we want to make sure targets aren't doing anything particularly dumb

Tests copied from aarch64 which has a mixture of general, legalization and special case tests
2020-10-09 19:19:47 +01:00
Jonas Devlieghere 5d501096ca [lldb] Update docs with new buildbot URLs
Buildbot got upgraded and now the (LLDB) builders have different URLs.
2020-10-09 10:57:39 -07:00
Giorgis Georgakoudis 3a6bfcf2f9 [OpenMPOpt] Merge parallel regions
There are cases that generated OpenMP code consists of multiple,
consecutive OpenMP parallel regions, either due to high-level
programming models, such as RAJA, Kokkos, lowering to OpenMP code, or
simply because the programmer parallelized code this way.  This
optimization merges consecutive parallel OpenMP regions to: (1) reduce
the runtime overhead of re-activating a team of threads; (2) enlarge the
scope for other OpenMP optimizations, e.g., runtime call deduplication
and synchronization elimination.

This implementation defensively merges parallel regions, only when they
are within the same BB and any in-between instructions are safe to
execute in parallel.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D83635
2020-10-09 09:59:04 -07:00
Louis Dionne c778f6c4f9 [libc++] Clean up logic around aligned/sized allocation and deallocation
Due to the need to support compilers that implement builtin operator
new/delete but not their align_val_t overloaded versions, there was a
lot of complexity. By assuming that a compiler that supports the builtin
new/delete operators also supports their align_val_t overloads, the code
can be simplified quite a bit.

Differential Revision: https://reviews.llvm.org/D88301
2020-10-09 12:43:28 -04:00
Louis Dionne a3a2431608 [clang] Don't look into <sysroot> for C++ headers if they are found alongside the toolchain
Currently, Clang looks for libc++ headers alongside the installation
directory of Clang, and it also adds a search path for headers in the
-isysroot. This is problematic if headers are found in both the toolchain
and in the sysroot, since #include_next will end up finding the libc++
headers in the sysroot instead of the intended system headers.

This patch changes the logic such that if the toolchain contains libc++
headers, no C++ header paths are added in the sysroot. However, if the
toolchain does *not* contain libc++ headers, the sysroot is searched as
usual.

This should not be a breaking change, since any code that previously
relied on some libc++ headers being found in the sysroot suffered from
the #include_next issue described above, which renders any libc++ header
basically useless.

Differential Revision: https://reviews.llvm.org/D89001
2020-10-09 12:41:41 -04:00
Louis Dionne 12805513a6 [libc++] Remove some workarounds for C++03
We don't support any compiler that doesn't support variadics and rvalue
references in C++03 mode, so these workarounds can be dropped. There's
still *a lot* of cruft related to these workarounds, but I try to tackle
a bit of it here and there.
2020-10-09 12:35:13 -04:00
Arthur Eubanks 0689dab844 [FixIrreducible][NewPM] Port -fix-irreducible to NPM
In the NPM, a pass cannot depend on another non-analysis pass. So pin
the test that tests that -lowerswitch is run automatically to legacy PM.

Reviewed By: sameerds

Differential Revision: https://reviews.llvm.org/D89051
2020-10-09 09:22:09 -07:00
Arthur Eubanks 9c21c6c966 [LoopInterchange][NewPM] Port -loop-interchange to NPM
Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D89058
2020-10-09 09:21:31 -07:00
Jay Foad 1dfbc2ea14 [AMDGPU] Only enable mad/mac legacy f32 patterns if denormals may be flushed
Following on from D88890, this makes the newly added patterns
conditional on NoFP32Denormals. mad/mac f32 instructions always flush
denormals regardless of the MODE register setting, and I believe the
legacy variants do the same.

Differential Revision: https://reviews.llvm.org/D89123
2020-10-09 17:08:38 +01:00
Tres Popp 46dd827232 [mlir] Forward listeners when utilizing scf::IfOp::get*BodyBuilder.
Without this PatternRewriting infrastructure does not know of modifications and
cannot properly legalize nor rollback changes.

Differential Revision: https://reviews.llvm.org/D89129
2020-10-09 18:03:01 +02:00
Simon Pilgrim 8a836daaa9 [InstCombine] Support lshr(trunc(lshr(x,c1)), c2) -> trunc(lshr(lshr(x,c1),c2)) uniform vector tests
FoldShiftByConstant is hardcoded for scalar/uniform outer shift amounts atm so that needs to be fixed first to support non-uniform cases
2020-10-09 16:54:46 +01:00
Simon Pilgrim af1f016436 [InstCombine] Add lshr(trunc(lshr(x,c1)), c2) -> trunc(lshr(lshr(x,c1),c2)) vector tests 2020-10-09 16:54:46 +01:00
Eugene Zhulenev 4e69a52952 [MLIR] Add async token/value arguments to async.execute op
Async execute operation can take async arguments as dependencies.

Change `async.execute` custom parser/printer format to use `%value as %unwrapped: !async.value<!type>` sytax.

Reviewed By: mehdi_amini, herhut

Differential Revision: https://reviews.llvm.org/D88601
2020-10-09 08:52:27 -07:00
Andrzej Warzynski dcd9be43e5 [mlir] Fix shared libs build
Reverts one breaking change introduced in
https://reviews.llvm.org/D88846.

Differential Revision: https://reviews.llvm.org/D89111
2020-10-09 16:38:42 +01:00
David Green 4c3515cd62 [ARM] Add MVE vecreduce costmodel tests. NFC
There were some existing tests that were not super useful. New ones are
added for testing MVE specific patterns.
2020-10-09 16:25:25 +01:00