Commit Graph

368648 Commits

Author SHA1 Message Date
Fangrui Song a8682554c6 [X86] Delete redundant 'static' from namespace scope 'static constexpr'. NFC
This decreases 7 lines as the result of packing more bits on one line.
2020-10-10 14:05:49 -07:00
Simon Pilgrim 702ccb40e2 [InstCombine] getLogBase2(undef) -> 0.
Move the undef element handling into the getLogBase2 helper instead of pre-empting with replaceUndefsWith.
2020-10-10 20:29:03 +01:00
Alex Denisov d0c8d58527 Fix CMake configuration error when run with -Werror/-Wall
The following code doesn't compile

  uint64_t i = x.load(std::memory_order_relaxed);
  return 0;

when CMAKE_C_FLAGS set to -Werror -Wall, thus incorrectly
breaking the CMake configuration step:

  -- Looking for __atomic_load_8 in atomic
  -- Looking for __atomic_load_8 in atomic - not found
  CMake Error at cmake/modules/CheckAtomic.cmake:79 (message):
    Host compiler appears to require libatomic for 64-bit operations, but
    cannot find it.
  Call Stack (most recent call first):
    cmake/config-ix.cmake:360 (include)
    CMakeLists.txt:671 (include)
2020-10-10 21:22:40 +02:00
Simon Pilgrim 3aab3cbd4a [InstCombine] getLogBase2 - no need to specify Type. NFCI.
In all the getLogBase2 uses, the specified Type is always the same as the constant being folded.
2020-10-10 20:09:55 +01:00
Simon Pilgrim f68d174c16 Remove %tmp variables from test cases to appease update_test_checks.py 2020-10-10 19:13:16 +01:00
Simon Pilgrim 2c3e4a21f9 [PowerPC] ReplaceNodeResults - bail on funnel shifts and let generic legalizers deal with it
Fixes regression raised on D88834 for 32-bit triple + 64-bit cpu cases (which apparently is a thing).
2020-10-10 19:13:16 +01:00
Krzysztof Parzyszek 4af6c6bf3c Define splat_vector for ISD::SPLAT_VECTOR in TargetSelectionDAG.td 2020-10-10 13:12:20 -05:00
Martin Storsjö 5d330f435e [lldb] [Windows] Remove unused functions. NFC.
These became unused in 51117e3c51.
2020-10-10 20:47:40 +03:00
Martin Storsjö abaca237c5 [lldb] [Windows] Add missing 'override', silencing warnings. NFC.
Also remove superfluous 'virtual' in overridden methods.
2020-10-10 20:47:40 +03:00
Simon Pilgrim f2e08c688e [PowerPC] Add ppc32 funnel shift test coverage 2020-10-10 18:19:42 +01:00
Simon Pilgrim 803b712330 [InstCombine] Add test case showing rotate intrinsic being split by SimplifyDemandedBits
Noticed while triaging regression report on D88834
2020-10-10 18:19:42 +01:00
Michał Górny 8dc2faf642 [lldb] [Process/FreeBSDRemote] Fix double semicolon 2020-10-10 18:54:52 +02:00
Michał Górny 9a37587ee3 [lldb] [Process/FreeBSDRemote] Kill process via PT_KILL
Use PT_KILL to kill the stopped process.  This ensures that the process
termination is reported properly and fixes delay/error on killing it.

Differential Revision: https://reviews.llvm.org/D89182
2020-10-10 18:54:05 +02:00
Michał Górny d83cd73e9d [lldb] [Process/FreeBSD] Mark methods override in RegisterContext*
Differential Revision: https://reviews.llvm.org/D89181
2020-10-10 18:52:23 +02:00
Philip Reames d89de5a14e Step down from security group
Resigning from security group as Azul representative as I have left Azul.  Previously communicated via email with security group.

Differential Revision: https://reviews.llvm.org/D88933
2020-10-10 09:48:02 -07:00
Tim Renouf 666ef0db20 [AMDGPU] Add gfx602, gfx705, gfx805 targets
At AMD, in an internal audit of our code, we found some corner cases
where we were not quite differentiating targets enough for some old
hardware. This commit is part of fixing that by adding three new
targets:

* The "Oland" and "Hainan" variants of gfx601 are now split out into
  gfx602. LLPC (in the GPUOpen driver) and other front-ends could use
  that to avoid using the shaderZExport workaround on gfx602.

* One variant of gfx703 is now split out into gfx705. LLPC and other
  front-ends could use that to avoid using the
  shaderSpiCsRegAllocFragmentation workaround on gfx705.

* The "TongaPro" variant of gfx802 is now split out into gfx805.
  TongaPro has a faster 64-bit shift than its former friends in gfx802,
  and a subtarget feature could be set up for that to take advantage of
  it. This commit does not make that change; it just adds the target.

V2: Add clang changes. Put TargetParser list in order.
V3: AMDGCNGPUs table in TargetParser.cpp needs to be in GPUKind order,
    so fix the GPUKind order.

Differential Revision: https://reviews.llvm.org/D88916

Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d
2020-10-10 17:22:22 +01:00
Florian Hahn d48b249b71 [SCEV] Add test cases where the max BTC is imprecise, due to step != 1.
Add a test case where we fail to compute a tight max backedge taken
count, due to the step being != 1.

This is part of the issue with PR40961.
2020-10-10 16:39:48 +01:00
Florian Hahn 2e9fd754b4 [SCEV] Handle ULE in applyLoopGuards.
Handle ULE predicate in similar fashion to ULT predicate in
applyLoopGuards.
2020-10-10 16:26:28 +01:00
Florian Hahn 2c6fc28aba [SCEV] Add a test case with ULE loop guard. 2020-10-10 15:58:26 +01:00
Nikita Popov 329dbdaaaf [MemCpyOpt] Add test for incorrect memset DSE (NFC)
We can't shorten the memset if there's a throwing call in between
and the destination is non-local.
2020-10-10 16:11:14 +02:00
David Green cb27006a94 [ARM] Attempt to make Tail predication / RDA more resilient to empty blocks
There are a number of places in RDA where we assume the block will not
be empty. This isn't necessarily true for tail predicated loops where we
have removed instructions. This attempt to make the pass more resilient
to empty blocks, not casting pointers to machine instructions where they
would be invalid.

The test contains a case that was previously failing, but recently been
hidden on trunk. It contains an empty block to begin with to show a
similar error.

Differential Revision: https://reviews.llvm.org/D88926
2020-10-10 14:50:25 +01:00
Alok Kumar Sharma 96bd4d34a2 [DebugInfo] Support for DWARF attribute DW_AT_rank
This patch adds support for DWARF attribute DW_AT_rank.

  Summary:
Fortran assumed rank arrays have dynamic rank. DWARF attribute
DW_AT_rank is needed to support that.

  Testing:
unit test cases added (hand-written)
check llvm
check debug-info

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D89141
2020-10-10 17:51:12 +05:30
Benjamin Kramer 0db08e59c9 [clangd] Map bits/stdint-intn.h and bits/stdint-uintn.h to cstdint.
These are private glibc headers containing parts of the implementation
of stdint.h.
2020-10-10 14:13:42 +02:00
David Green 6d8eea61b1 [AArch64][LV] Move vectorizer test to Transforms/LoopVectorize/AArch64. NFC 2020-10-10 10:15:43 +01:00
David Green f2741f2aee [TblGen][Scheduling] Fix debug output. NFC
This just moves some newlines to the expected places.
2020-10-10 10:04:28 +01:00
Tatiana Shpeisman 9909ef292d [mlir][scf] Fix a bug in scf::ForOp loop unroll with an epilogue
Fixes a bug in formation and simplification of an epilogue loop generated
during loop unroll of scf::ForOp (https://bugs.llvm.org/show_bug.cgi?id=46689)

Differential Revision: https://reviews.llvm.org/D87583
2020-10-10 14:18:25 +05:30
Nikita Popov 5e855f1e80 [MemCpyOpt] Don't hoist store that's not guaranteed to execute
MemCpyOpt can hoist stores while load+store pairs into memcpy.
This hoisting can currently result in stores being executed that
weren't guaranteed to execute in the original problem.

Differential Revision: https://reviews.llvm.org/D89154
2020-10-10 10:26:28 +02:00
Denis Antrushin 2b96dcebfa [Statepoints] Allow deopt GC pointer on VReg if gc-live bundle is empty.
Currently we allow passing pointers from deopt bundle on VReg only if
they were seen in list of gc-live pointers passed on VRegs.
This means that for the case of empty gc-live bundle we spill deopt
bundle's pointers. This change allows lowering deopt pointers to VRegs
in case of empty gc-live bundle. In case of non-empty gc-live bundle,
behavior does not change.

Reviewed By: skatkov

Differential Revision: https://reviews.llvm.org/D88999
2020-10-10 14:58:08 +07:00
Zi Xuan Wu e1c38dd55d [CSKY 1/n] Add basic stub or infra of csky backend
This patch introduce files that just enough for lib/Target/CSKY to compile.
Notably a basic CSKYTargetMachine and CSKYTargetInfo.

Differential Revision: https://reviews.llvm.org/D88466
2020-10-10 10:44:08 +08:00
Fangrui Song 2bd4730850 [PowerPC] Fix signed overflow in decomposeMulByConstant after D88201
Caught by multipliers LONG_MAX (after +1) and LONG_MIN (after -1) in CodeGen/PowerPC/mul-const-i64.ll
2020-10-09 18:29:12 -07:00
Xiang1 Zhang 0232f2d36d [X86] Add CET test, NFC 2020-10-10 09:13:04 +08:00
Valentin Clement 6260cb1d4d [mlir][openacc] Introduce acc.exit_data operation
This patch introduces the acc.exit_data operation that represents an OpenACC Exit Data directive.
Operands and attributes are derived from clauses in the spec 2.6.6.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D88969
2020-10-09 21:02:56 -04:00
Sean Silva a2b6c75ac0 [mlir] Rename BufferPlacement.h to Bufferize.h
Context: https://llvm.discourse.group/t/what-is-the-strategy-for-tensor-memref-conversion-bufferization/1938/14

Differential Revision: https://reviews.llvm.org/D89174
2020-10-09 17:48:20 -07:00
Walter Erquinigo ea1f49741e [intel pt] Refactor parsing
With the feedback I was getting in different diffs, I realized that splitting the parsing logic into two classes was not easy to deal with. I do see value in doing that, but I'd rather leave that as a refactor after most of the intel-pt logic is in place. Thus, I'm merging the common parser into the intel pt one, having thus only one that is fully aware of Intel PT during parsing and object creation.

Besides, based on the feedback in https://reviews.llvm.org/D88769, I'm creating a ThreadIntelPT class that will be able to orchestrate decoding of its own trace and can handle the stop events correctly.

This leaves the TraceIntelPT class as an initialization class that glues together different components. Right now it can initialize a trace session from a json file, and in the future will be able to initialize a trace session from a live process.

Besides, I'm renaming SettingsParser to SessionParser, which I think is a better name, as the json object represents a trace session of possibly many processes.

With the current set of targets, we have the following

- Trace: main interface for dealing with trace sessions
- TraceIntelPT: plugin Trace for dealing with intel pt sessions
- TraceIntelPTSessionParser: a parser of a json trace session file that can create a corresponding TraceIntelPT instance along with Targets, ProcessTraces (to be created in https://reviews.llvm.org/D88769), and ThreadIntelPT threads.
- ProcessTrace: (to be created in https://reviews.llvm.org/D88769) can handle the correct state of the traces as the user traverses the trace. I don't think there'll be a need an intel-pt specific implementation of this class.
- ThreadIntelPT: a thread implementation that can handle the decoding of its own trace file, along with keeping track of the current position the user is looking at when doing reverse debugging.

Differential Revision: https://reviews.llvm.org/D88841
2020-10-09 17:32:04 -07:00
Aart Bik 3c366740ca [mlir] [standard] fixed typo in comment
There is an atomic_rmw and a generic_atomic_rmw operation.
The doc of the latter incorrectly referred to former though.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D89172
2020-10-09 17:04:21 -07:00
Fangrui Song f6fa4d07dc [bugpoint] Delete -safe-llc and make -run-llc work like -run-llc -safe-run-llc 2020-10-09 16:38:30 -07:00
Stella Stamenova 09dbdcf15f [mlir, win] Mark several MLRI tests as unsupported on system-windows
They are currently marked as unsupported when windows is part of the triple, but they actually fail when they are run on Windows, so they are unsupported on system-windows

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D89169
2020-10-09 16:27:50 -07:00
Changpeng Fang f192a27ed3 Sink: Handle instruction sink when a user is dead
Summary:
  The current instruction sink pass uses findNearestCommonDominator of all users to find block to sink the instruction to.
However, a user may be in a dead block, which will result in unexpected behavior.

This patch handles such cases by skipping dead blocks. This patch fixes:
https://bugs.llvm.org/show_bug.cgi?id=47415

Reviewers:
  MaskRay, arsenm

Differential Revision:
  https://reviews.llvm.org/D89166
2020-10-09 16:20:26 -07:00
Joao Moreira e0b89df2e0 [X86] Check if call is indirect before emitting NT_CALL
The notrack prefix is a relaxation of CET policies which makes it possible to indirectly call targets which do not have an ENDBR instruction in the landing address. To emit a call with this prefix, the special attribute "nocf_check" is used. When used as a function attribute, a CallInst targeting the respective function will return true for the method "doesNoCfCheck()", no matter if it is a direct call (and such should remain like this, as the information that the to-be-called function won't perform control-flow checks is useful in other contexts). Yet, when emitting an X86ISD::NT_CALL, the respective CallInst should be verified for its indirection, allowing that the prefixed calls are only emitted in the right situations.

Update the respective testing unit to also verify for direct calls to functions with ''nocf_check'' attributes.

The bug can also be reproduced through compiling the following C code using the -fcf-protection=full flag.

int __attribute__((nocf_check)) foo(int a) {};

int main() {
  foo(42);
}

Differential Revision: https://reviews.llvm.org/D87320
2020-10-09 15:54:23 -07:00
Fangrui Song 488f1c4893 [X86][test] Add a regression test for lock cmpxchg16b on a global variable with offset
Add a test for a bug (uncovered by D88808) fixed by f34bb06935.
Also delete cmpxchg16b.ll which is covered by atomic128.ll

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D89163
2020-10-09 15:44:32 -07:00
Eli Friedman 278299b0f0 [SCCP] Reduce the number of times ResolvedUndefsIn is called for large modules.
If a module has many values that need to be resolved by
ResolvedUndefsIn, compilation takes quadratic time overall. Solve should
do a small amount of work, since not much is added to the worklists each
time markOverdefined is called. But ResolvedUndefsIn is linear over the
length of the function/module, so resolving one undef at a time is
quadratic in general.

To solve this, make ResolvedUndefsIn resolve every undef value at once,
instead of resolving them one at a time. This loses a little
optimization power, but can be a lot faster.

We still need a loop around ResolvedUndefsIn because markOverdefined
could change the set of blocks that are live. That should be uncommon,
hopefully. We could optimize it by tracking which blocks transition from
dead to live, instead of iterating over the whole module to find them.
But I'll leave that for later. (The whole function will become a lot
simpler once we start pruning branches on undef.)

The regression test changes seem minor. The specific cases in question
could probably be optimized with a bit more work, but they seem like
edge cases that don't really matter.

Fixes an "infinite" compile issue my team found on an internal workoad.

Differential Revision: https://reviews.llvm.org/D89080
2020-10-09 15:24:16 -07:00
Steven Wu 360f275cb7 [IRMover] Add missing open quote in the warning message
Fix the missing single quotation mark in the warning message for
target triple mismatch.
2020-10-09 15:17:16 -07:00
Jordan Rupprecht 9b5b305023 Temporarily revert "[ThinLTO] Re-order modules for optimal multi-threaded processing"
This reverts commit 6537004913. This is causing test failures internally, and while a few of the cases turned out to be bad user code (relying on a specific order of static initialization across translation units), some cases are less clear. Temporarily reverting for now, and Teresa is going to follow up with more details.
2020-10-09 14:36:20 -07:00
Thomas Lively d8f58bf53a [WebAssembly] Prototype i16x8.q15mulr_sat_s
This saturating, rounding, Q-format multiplication instruction is proposed in
https://github.com/WebAssembly/simd/pull/365.

Differential Revision: https://reviews.llvm.org/D88968
2020-10-09 21:17:53 +00:00
Louis Dionne 2dc9b26c00 [libc++] Remove code to prevent overwriting the system libc++ on Darwin
The system partition is read-only since Catalina.
2020-10-09 17:02:39 -04:00
Louis Dionne 4bd3d16c2d [libc++] Remove redundant if(LIBCXX_INSTALL_LIBRARY)
The individual LIBCXX_INSTALL_(SHARED|STATIC)_LIBRARY are already
dependent on whether LIBCXX_INSTALL_LIBRARY is ON or OFF.
2020-10-09 17:02:39 -04:00
Saleem Abdulrasool 5d74c43511 DirectoryWatcher: add an implementation for Windows
This implements the directory watcher on Windows.  It does the most
naive thing for simplicity.  ReadDirectoryChangesW is used to monitor
the changes.  However, in order to support interruption, we must use
overlapped IO, which allows us to use the blocking, synchronous
mechanism.  We create a thread to post the notification to the consumer
to allow the monitoring to continue.  The two threads communicate via a
locked queue.

Differential Revision: https://reviews.llvm.org/D88666
Reviewed By: Adrian McCarthy
2020-10-09 20:55:57 +00:00
Mircea Trofin c11c20fb00 [NFC][Regalloc] VirtRegAuxInfo::Hint does not need to be a field
It is only used in weightCalcHelper, and cleared upon its finishing its
job there.

The patch further cleans up style guide discrepancies, and simplifies
CopyHint by removing duplicate 'IsPhys' information (it's what the Reg
field would report).
2020-10-09 13:42:23 -07:00
Krzysztof Parzyszek 6fd994b4b7 [Hexagon] Remove ISD node VSPLATW, use VSPLAT instead
This is a step towards improving HVX codegen for splat.
2020-10-09 15:38:02 -05:00
Krzysztof Parzyszek 33bb3efbb3 [Hexagon] Generalize handling of SDNodes created during ISel
The selection of HVX shuffles can produce more nodes in the DAG,
which need special handling, or otherwise they would be left
unselected by the main selection code. Make the handling of such
nodes more general.
2020-10-09 15:38:02 -05:00