Commit Graph

378101 Commits

Author SHA1 Message Date
Hsiangkai Wang f19849a07b [RISCV] Update V extension to v1.0-draft 08a0b464.
Differential Revision: https://reviews.llvm.org/D94583
2021-01-26 12:02:43 +08:00
Hsiangkai Wang b69932b550 [RISCV] Implement vlsegff intrinsics.
Differential Revision: https://reviews.llvm.org/D95303
2021-01-26 12:02:43 +08:00
Kazu Hirata c85b6bf33c [AMDGPU] Forward-declare MachineIRBuilder (NFC)
AMDGPULegalizerInfo.h needs MachineIRBuilder but relies on a forward
declaration of MachineIRBuilder in LegalizerInfo.h.  This patch adds a
forward declaration right in AMDGPULegalizerInfo.h.

While we are at it, this patch removes the one in LegalizerInfo.h,
where it is unnecessary.
2021-01-25 19:24:01 -08:00
Kazu Hirata 772134e3ec [StackSafety] Use ListSeparator (NFC) 2021-01-25 19:23:59 -08:00
Kazu Hirata 5d3f3d3a05 [TableGen] Use llvm::append_range (NFC) 2021-01-25 19:23:58 -08:00
Shilei Tian 9d64275ae0 [OpenMP] Added the support for hidden helper task in RTL
The basic design is to create an outer-most parallel team. It is not a regular team because it is only created when the first hidden helper task is encountered, and is only responsible for the execution of hidden helper tasks.  We first use `pthread_create` to create a new thread, let's call it the initial and also the main thread of the hidden helper team. This initial thread then initializes a new root, just like what RTL does in initialization. After that, it directly calls `__kmpc_fork_call`. It is like the initial thread encounters a parallel region. The wrapped function for this team is, for main thread, which is the initial thread that we create via `pthread_create` on Linux, waits on a condition variable. The condition variable can only be signaled when RTL is being destroyed. For other work threads, they just do nothing. The reason that main thread needs to wait there is, in current implementation, once the main thread finishes the wrapped function of this team, it starts to free the team which is not what we want.

Two environment variables, `LIBOMP_NUM_HIDDEN_HELPER_THREADS` and `LIBOMP_USE_HIDDEN_HELPER_TASK`, are also set to configure the number of threads and enable/disable this feature. By default, the number of hidden helper threads is 8.

Here are some open issues to be discussed:
1. The main thread goes to sleeping when the initialization is finished. As Andrey mentioned, we might need it to be awaken from time to time to do some stuffs. What kind of update/check should be put here?

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D77609
2021-01-25 22:16:17 -05:00
Jon Chesterfield 357eea6e8b Revert "[libomptarget][cuda] Gracefully handle missing cuda library"
This reverts commit fafd45c01f.
2021-01-26 03:14:53 +00:00
Cassie Jones fdbfda2178 [Test][AArch64] Add s32 legalizer test for UADDE/USUBE
Differential Revision: https://reviews.llvm.org/D95324
2021-01-25 22:04:42 -05:00
Cassie Jones 2ba1f9c4e0 [Test][AArch64] Move overflow add/sub tests to their own file. NFC
Differential Revision: https://reviews.llvm.org/D95323
2021-01-25 22:02:31 -05:00
Lang Hames 236b0d0407 [JITLink] Disable ELF_ehframe_basic.s test on Windows.
This test is failing on some windows bots with an error claiming that it is not
producing output. This appears to be a spurious failure, so I'm disabling on
windows while we investigate rather than reverting.
2021-01-26 13:58:38 +11:00
Nemanja Ivanovic 8018f731f0 [PowerPC] Do not emit HW loop with half precision operations
If a loop has any operations on half precision values, there will be calls to
library functions on Power8. Even on Power9, there is a small subset of
instructions that are actually supported for the type.

This patch disables HW loops whenever any operations on the type are found
(other than the handfull of supported ones when compiling for Power9). Fixes a
few PR's opened by Julia:

https://bugs.llvm.org/show_bug.cgi?id=48785
https://bugs.llvm.org/show_bug.cgi?id=48786
https://bugs.llvm.org/show_bug.cgi?id=48519

Differential revision: https://reviews.llvm.org/D94980
2021-01-25 20:55:56 -06:00
Jon Chesterfield fafd45c01f [libomptarget][cuda] Gracefully handle missing cuda library
[libomptarget][cuda] Gracefully handle missing cuda library

If using dynamic cuda, and it failed to load, it is not safe to call
cuGetErrorString.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95412
2021-01-26 02:54:00 +00:00
Wolfgang Pieb 231a82a150 [X86] Correct some cross references in avxintrin.h. 2021-01-25 18:49:28 -08:00
Sergey Dmitriev 13cedcaf45 [llvm-link] Fix crash when materializing appending global
This patch fixes llvm-link crash when materializing global variable
with appending linkage and initializer that depends on another
global with appending linkage.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D95329
2021-01-25 18:08:07 -08:00
Richard Smith 8b11714885 Revert "Fix SBDebugger::CreateTargetWithFileAndArch to accept LLDB_ARCH_DEFAULT."
Also revert "Follow on to: f05dc40c31d1883b46b8bb60547087db2f4c03e3"

After these changes, multiple lldb tests are failing. Calls to
CreateTargetWithFileAndArch(None, None) appear to fail after these
changes.

This reverts commit f05dc40c31 and
1fba21778f.
2021-01-25 18:05:17 -08:00
Brad Smith 4b6d7fdd20 [libcxx] random_device, for OpenBSD specify optimal entropy properties
Reviewed By: ldionne

Differential Revision: https://reviews.llvm.org/D94571
2021-01-25 20:55:09 -05:00
Duncan P. N. Exon Smith f4d02fbe41 Frontend: Take VFS and MainFileBuffer by reference in PrecompiledPreamble::CanReuse, NFC
Clarify that `PrecompiledPreamble::CanReuse` requires non-null arguments
for `VFS` and `MainFileBuffer`, taking them by reference instead of by
pointer.

Differential Revision: https://reviews.llvm.org/D91297
2021-01-25 17:50:56 -08:00
Amara Emerson 03bce0bf4e [GlobalISel][Localizer] Don't localize phi operands which are used more than once in the phi.
The current algorithm just tries to localize defs as far as they can go, and in
the case of G_PHI operands, it clones the def into the predecessor block for
each incoming edge. When multiple edges have the same register value, this can
cause unnecessary code bloat, and inhibit later optimizations.

This change checks if a given phi operand is unique in the phi, if not the
def of that register is not localized to the predecessor.

Differential Revision: https://reviews.llvm.org/D95406
2021-01-25 17:48:04 -08:00
Julian Lettner 6f1d4fb8fc [lit] Update lit.py shebang for Python3
Update shebang to always use Python3 when executing `lit.py` directly.

A previous change of mine [1] revealed that we still use Python2 on some
bot configurations that invoke `llvm/utils/lit/lit.py` as a script
directly (instead of `python3 path/to/lit.py`).

[1] https://reviews.llvm.org/D94734

Differential Revision: https://reviews.llvm.org/D95393
2021-01-25 17:29:57 -08:00
Wolfgang Pieb 350395d82f [x86] Fix trivial typo in emmintrin.h 2021-01-25 17:28:05 -08:00
Duncan P. N. Exon Smith 8d67b9e246 SourceManager: Migrate to FileEntryRef in getOrCreateContentCache, NFC
Change `SourceManager::getOrCreateContentCache` to take a `FileEntryRef`
and update call sites (mostly internal to SourceManager.cpp). In a
couple of cases this temporarily relies on `FileEntry::getLastRef`, but
those can be cleaned up once other APIs switch over.

The one change outside of SourceManager.cpp is in ASTReader.cpp, which
stops relying on the auto-degrade-to-`FileEntry*` behaviour from
`InputFile::getFile` since it now needs a `FileEntryRef`.

No functionality change here.

Differential Revision: https://reviews.llvm.org/D92983
2021-01-25 17:03:12 -08:00
Duncan P. N. Exon Smith 46b1645e6c SourceManager: Unify FileEntry/FileEntryRef versions of createFileID
Change `SourceManager::createFileID(const FileEntry*)` to defer to
`SourceManager::createFileID(FileEntryRef)`. This fixes an unexercised
bug where the latter gained support for named pipes and the former
didn't, but since we're trying to remove all calls to the former it
doesn't really make sense to test this explicitly now that the
implementation is hollowed out.

This is a belated follow-up to 245218bb35,
which sunk named pipe support into FileManager and SourceManager. The
original version of that patch was based on top of
https://reviews.llvm.org/D92984, which removed the `FileEntry` overload
of `createFileID()`, and I missed the subtle difference when it was
rebased.
2021-01-25 17:03:12 -08:00
Lang Hames cda4d3d37f [JITLink] Re-apply 6884fbc2c4 (ELF eh support) with fix for broken test case. 2021-01-26 11:55:41 +11:00
Eugene Zhulenev d37b5393e8 [mlir:Async] Use LLVM coro operations in async.coro lowering
Instead of using llvm.call operations to call LLVM coro intrinsics use Coro operations from the LLVM dialect.

(This was reviewed as a part of https://reviews.llvm.org/D94923 but was lost in arc land from local branch)

Differential Revision: https://reviews.llvm.org/D95405
2021-01-25 16:42:11 -08:00
Duncan P. N. Exon Smith 8b6aedc4c9 ExpressionParser: Migrate to FileEntryRef in ParseInternal, NFC
Migrate to the `FileEntryRef` overload of `SourceManager::createFileID`
(using `FileManager::getOptionalFileRef`) in
`ClangExpressionParser::ParseInternal`.

No functionality change here.

Differential Revision: https://reviews.llvm.org/D92957
2021-01-25 16:38:42 -08:00
Craig Topper ea87cf2acd [TargetLowering][RISCV] Don't transform (seteq/ne (sext_inreg X, VT), C1) -> (seteq/ne (zext_inreg X, VT), C1) if the sext_inreg is cheaper
RISCV has to use 2 shifts for (i64 (zext_inreg X, i32)), but we
can use addiw rd, rs1, x0 for sext_inreg. We already understood this
when type legalizing i32 seteq/ne on rv64. But this transform in
SimplifySetCC would sometimes undo it.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D95289
2021-01-25 16:37:21 -08:00
Arthur O'Dwyer f9b6fd269b [libc++] Support immovable return types in std::function.
LWG reflector consensus is that this was a bug in libc++.
(In particular, MSVC also will fix it in their STL, soon.)
Bug originally discovered by Logan Smith.

Also fix `std::function<const void()>`, which should work
the same way as `std::function<void()>` in terms of allowing
"conversions" from non-void types.

Differential Revision: https://reviews.llvm.org/D94452
2021-01-25 19:34:41 -05:00
David Blaikie 70e251497c DebugInfo: Generalize the .debug_addr minimization flag to pave the way for including other strategies 2021-01-25 16:24:35 -08:00
Mitch Phillips c9466ede7e Revert "Revert "[GlobalISel] LegalizerHelper - Extract widenScalarAddoSubo method""
This reverts commit 554b3211fe.

Differential Revision: https://reviews.llvm.org/D95035
2021-01-25 16:22:22 -08:00
Craig Topper 15f66cf749 [RISCV] Add isel patterns to optimize slli.uw patterns without Zba extension.
This pattern can occur when an unsigned is used to index an array
on RV64.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D95290
2021-01-25 16:12:08 -08:00
Changpeng Fang 5b648df1a8 AMDGPU: Reduce the number of expensive calls in SIFormMemoryClause
Summary:
  RPTracker::reset(MI) is a very expensive call when the number of virtual registers is huge.
We observed a long compilation time issue when RPT::reset() is called once for each cluster.

In this work, we call RPT.reset() only at the first seen cluster, and use advance() to get
the register pressure for the later clusters in the same basic block. This could effectively reduce the number
of the expensive calls and thus reduce the compile time.

Reviewers:
  rampitec

Fixes:
  SWDEV-239161

Differential Revision:
  https://reviews.llvm.org/D95273
2021-01-25 16:08:08 -08:00
Jez Ng 3dd5ea9dd8 [lld-macho] Link against ObjCARCOpts instead of ObjCARC
Not sure what the difference is, but using the latter appears to cause
issues in standalone builds. See llvm.org/PR48853.

Reviewed By: #lld-macho, compnerd

Differential Revision: https://reviews.llvm.org/D95359
2021-01-25 19:01:48 -05:00
Mircea Trofin 91b61abafb [NFC] Disallow unused prefixes in clang/test/Analysis
Differential Revision: https://reviews.llvm.org/D95249
2021-01-25 15:53:00 -08:00
Stanislav Mekhanoshin eace81c48f [AMDGPU] Added -mcpu=tahiti to 3 tests. NFC. 2021-01-25 15:50:59 -08:00
Pedro Tammela 532e4203c5 [lldb/Lua] add support for Lua function breakpoint
Adds support for running a Lua function when a breakpoint is hit.

Example:
   breakpoint command add -s lua -F abc

The above runs the Lua function 'abc' passing 2 arguments. 'frame', 'bp_loc' and 'extra_args'.

A third parameter 'extra_args' is only present when there is structured data
declared in the command line.

Example:
   breakpoint command add -s lua -F abc -k foo -v bar

Differential Revision: https://reviews.llvm.org/D93649
2021-01-25 23:40:57 +00:00
modimo ce7f9cdb50 [InlineAdvisor] Allow replay of inline decisions for the CGSCC inliner from optimization remarks
This change leverages the work done in D83743 to replay in the SampleProfile inliner to also be used in the CGSCC inliner. NOTE: currently restricted to non-ML advisors only.

The added switch `-cgscc-inline-replay=<remarks file>` will replay the inlining decisions in that file where the remarks file is generated via `-Rpass=inline`. The aim here is to make it easier to analyze changes that would modify inlining heuristics to be separated from this behavior. Doing so allows easier examination of assembly and runtime behavior compared to the baseline rather than trying to dig through the large churn caused by inlining.

In LTO compilation, since inlining is done twice you can separately specify replay by passing the flag to the FE (`-cgscc-inline-replay=`) and to the linker (`-Wl,cgscc-inline-replay=`) with the remarks generated from their respective places.

Testing on mysqld by comparing the inline decisions between base (generates remarks.txt) and diff (replay using identical input/tools with remarks.txt) and examining the inlining sites with `diff` shows 14,000 mismatches out of 247,341 for a ~94% replay accuracy. I believe this gap can be narrowed further though for the general case we may never achieve full accuracy. For my personal use, this is close enough to be representative: I set the baseline as the one generated by the replay on identical input/toolset and compare that to my modified input/toolset using the same replay.

Testing:
ninja check-llvm
newly added test correctly replays CGSCC inlining decisions

Reviewed By: mtrofin, wenlei

Differential Revision: https://reviews.llvm.org/D94334
2021-01-25 15:38:57 -08:00
Shilei Tian 3333244d77 [OpenMP][deviceRTLs] Remove omp_is_initial_device
`omp_is_initial_device` in device code was implemented as a builtin
function in D38968 for a better performance. Therefore there is no chance that
this function will be called to `deviceRTLs`. As we're moving to build `deviceRTLs`
with OpenMP compiler, this function can lead to a compilation error. This patch
just simply removes it.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D95397
2021-01-25 18:34:23 -05:00
Leonard Chan c0e94e9974 [clang][Fuchsia] Add relative-vtables + asan multilibs
We're choosing to take an opt-in approach for landing Relative VTables, so we'll
need asan-equivalent multilibs with relative vtables enabled. Afterwards, we can
just flip the switch in our build.

Differential Revision: https://reviews.llvm.org/D95253
2021-01-25 15:24:16 -08:00
Duncan P. N. Exon Smith 080952a944 Support: Remove duplicated code in {File,clang::ModulesDependency}Collector, NFC
Refactor the duplicated canonicalize-path logic in `FileCollector` and
`ModulesDependencyCollector` into a new utility called
`PathCanonicalizer` that's shared. This popped up when tracking down a
bug common to both in https://reviews.llvm.org/D95202.

As drive-bys, update a few names and comments to better reflect the
effect of the code, delay removal of `..`s to avoid an unnecessary extra
string copy, and leave behind a couple of FIXMEs for future
consideration.

Differential Revision: https://reviews.llvm.org/D95279
2021-01-25 15:09:00 -08:00
David Blaikie 6846686128 Fix -Wmissing-override in lldb 2021-01-25 15:04:21 -08:00
Walter Erquinigo 50337fb933 Fix runInTerminal errors on ARM
Caused by https://reviews.llvm.org/D93951

This feature is not needed on ARM, so let's just disable the tests on
ARM.
2021-01-25 14:55:15 -08:00
Harald van Dijk b43c26d036
Restore GNU , ## __VA_ARGS__ behavior in MSVC mode
As noted in D91913, MSVC implements the GNU behavior for
, ## __VA_ARGS__ as well. Do the same when `-fms-compatibility` is used.

Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D95392
2021-01-25 22:34:49 +00:00
Jim Ingham 1fba21778f Follow on to: f05dc40c31
When you pass in a bogus ArchSpec, TargetList.CreateTarget
makes a target with the arch of the executable.  That wasn't the
case with a bogus triple, so this change caused one of the bogus
input data tests to fail.  So check that the ArchSpec is valid
before passing it to CreateTarget.
2021-01-25 14:25:51 -08:00
Sam McCall 118c33ef47 [clangd] Allow configuration database to be specified in config.
This allows for more flexibility than -compile-commands-dir or ancestor
discovery.

See https://github.com/clangd/clangd/issues/116

Differential Revision: https://reviews.llvm.org/D95057
2021-01-25 23:15:48 +01:00
Nikita Popov 835104a114 [LSR] Drop potentially invalid nowrap flags when switching to post-inc IV (PR46943)
When LSR converts a branch on the pre-inc IV into a branch on the
post-inc IV, the nowrap flags on the addition may no longer be valid.
Previously, a poison result of the addition might have been ignored,
in which case the program was well defined. After branching on the
post-inc IV, we might be branching on poison, which is undefined behavior.

Fix this by discarding nowrap flags which are not present on the SCEV
expression. Nowrap flags on the SCEV expression are proven by SCEV
to always hold, independently of how the expression will be used.
This is essentially the same fix we applied to IndVars LFTR, which
also performs this kind of pre-inc to post-inc conversion.

I believe a similar problem can also exist for getelementptr inbounds,
but I was not able to come up with a problematic test case. The
inbounds case would have to be addressed in a differently anyway
(as SCEV does not track this property).

Fixes https://bugs.llvm.org/show_bug.cgi?id=46943.

Differential Revision: https://reviews.llvm.org/D95286
2021-01-25 23:13:48 +01:00
Fraser Cormack 15141cd115 [RISCV] Add RVV insertelt/extractelt scalable-vector patterns
Original patch by @rogfer01.

This patch adds support for insertelt and extractelt operations on
scalable vectors.

Special care must be taken on RV32 when dealing with i64 vectors as
there are no straightforward ways to insert a 64-bit element without a
register of that size. To that end, both are custom-lowered to different
sequences.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94615
2021-01-25 22:03:52 +00:00
Walter Erquinigo 1ac36b34db Fix 0f0462cacf
This fails on Windows because std::future<llvm::Error> fail to compile.
Now switching to SBError as a workaround.

Failed buildbot: http://lab.llvm.org:8011/#/builders/83/builds/3021
2021-01-25 14:06:10 -08:00
Eugene Zhulenev 9c53b8e52e [mlir:Async] Add intermediate async.coro and async.runtime operations to simplify Async to LLVM lowering
[NFC] No new functionality, mostly a cleanup and one more abstraction level between Async and LLVM IR.

Instead of lowering from Async to LLVM coroutines and Async Runtime API in one shot, do it progressively via async.coro and async.runtime operations.

1. Lower from async to async.runtime/coro (e.g. async.execute to function with coro setup and runtime calls)
2. Lower from async.runtime/coro to LLVM intrinsics and runtime API calls

Intermediate coro/runtime operations will allow to run transformations on a higher level IR and do not try to match IR based on the LLVM::CallOp properties.

Although async.coro is very close to LLVM coroutines, it is not exactly the same API, instead it is optimized for usability in async lowering, and misses a lot of details that are present in @llvm.coro intrinsic.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D94923
2021-01-25 14:04:33 -08:00
Quentin Chateau 3680cb99a7 [clangd] ignore parallelism level for quick tasks
This allows quick tasks without dependencies that
need to run fast to run ASAP. This is mostly useful
for code formatting.

----------------------------

This fixes something that's been annoying me:
- Open your IDE workspace and its 20 open files
- Clangd spends 5 minutes parsing it all
- In the meantime you start to work
- Save a file, trigger format-on-save, which hangs because clangd is busy
- You're stuck waiting until clangd is done parsing your files before the formatting and save takes place

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D94875
2021-01-25 23:03:43 +01:00
Cassie Jones aa8f3677f7 Recommit "[AArch64][GlobalISel] Implement widenScalar for signed overflow"
Implement widening for G_SADDO and G_SSUBO.
Add legalize-add/sub tests for narrow overflowing add/sub on AArch64.

Differential Revision: https://reviews.llvm.org/D95034
2021-01-25 16:57:20 -05:00