Commit Graph

402256 Commits

Author SHA1 Message Date
Zhi An Ng 2542bfa43a [WebAssembly] Add prototype relaxed swizzle instructions
Add i8x16 relaxed_swizzle instructions. These are only
exposed as builtins, and require user opt-in.

Differential Revision: https://reviews.llvm.org/D112022
2021-10-19 17:53:04 -07:00
Shafik Yaghmour 320f65ee65 [LLDB][NFC] Remove parameter names from forward declarations from hand written expressions used in heap.py part 2
heap.py has a lot of large hand written expressions and each name in the
expression will be looked up by clang during expression parsing. For
function parameters this will be in Sema::ActOnParamDeclarator(...) in order to
catch redeclarations of parameters. The names are not needed and we have seen
some rare cases where since we don't have symbols we end up in
SymbolContext::FindBestGlobalDataSymbol(...) which may conflict with other global
symbols.

There may be a way to make this lookup smarter to avoid these cases but it is
not clear how well tested this path is and how much work it would be to fix it.
So we will go with this fix while we investigate more.

This is a second try at getting all the cases we care about.

Ref: rdar://78265641
2021-10-19 16:52:36 -07:00
Yonghong Song cd40b5a712 BPF: set .BTF and .BTF.ext section alignment to 4
Currently, .BTF and .BTF.ext has default alignment of 1.
For example,
  $ cat t.c
    int foo() { return 0; }
  $ clang -target bpf -O2 -c -g t.c
  $ llvm-readelf -S t.o
    ...
    Section Headers:
    [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
    ...
    [ 7] .BTF              PROGBITS        0000000000000000 000167 00008b 00      0   0  1
    [ 8] .BTF.ext          PROGBITS        0000000000000000 0001f2 000050 00      0   0  1

But to have no misaligned data access, .BTF and .BTF.ext
actually requires alignment of 4. Misalignment is not an issue
for architecture like x64/arm64 as it can handle it well. But
some architectures like mips may incur a trap if .BTF/.BTF.ext
is not properly aligned.

This patch explicitly forced .BTF and .BTF.ext alignment to be 4.
For the above example, we will have
    [ 7] .BTF              PROGBITS        0000000000000000 000168 00008b 00      0   0  4
    [ 8] .BTF.ext          PROGBITS        0000000000000000 0001f4 000050 00      0   0  4

Differential Revision: https://reviews.llvm.org/D112106
2021-10-19 16:26:01 -07:00
Artem Belevich b6b7fe60a4 [NVPTX] Add a late SROA pass which allows optimizing away more allocas.
Fixes performance regression https://bugs.llvm.org/show_bug.cgi?id=52037

Differential Revision: https://reviews.llvm.org/D111471
2021-10-19 16:18:28 -07:00
Stella Laurenzo a897590f11 Add MLIR_INSTALL_AGGREGATE_OBJECTS and default it to ON.
* Package maintainers can opt to disable installation of these objects.
* Per discussion on https://reviews.llvm.org/D111504

Differential Revision: https://reviews.llvm.org/D112090
2021-10-19 16:14:04 -07:00
Kojo Acquah 9c62bb55f4 Implementation of `ReshapeNoopOptimization` canonicalizer.
This canonicalizer replaces reshapes of constant tensors that contain the updated shape (skipping the reshape operation).

Differential Revision: https://reviews.llvm.org/D112038
2021-10-19 16:07:34 -07:00
Yuta Saito 1813fde9cc [WebAssembly] Emit clangast in custom section aligned by 4 bytes
Emit __clangast in custom section instead of named data segment
to find it while iterating sections.
This could be avoided if all data segements (the wasm sense) were
represented as their own sections (in the llvm sense).
This can be resolved by https://github.com/WebAssembly/tool-conventions/issues/138

And the on-disk hashtable in clangast needs to be aligned by 4 bytes,
so add paddings in name length field in custom section header.

The length of clangast section name can be represented in 1 byte
by leb128, and possible maximum pads are 3 bytes, so the section
name length won't be invalid in theory.

Fixes https://bugs.llvm.org/show_bug.cgi?id=35928

Differential Revision: https://reviews.llvm.org/D74531
2021-10-19 15:50:08 -07:00
Arthur Eubanks 9660563950 [llvm-reduce] Add reduction passes to reduce operands to undef/1/0
Having non-undef constants in a final llvm-reduce output is nicer than
having undefs.

This splits the existing reduce-operands pass into three, one which does
the same as the current pass of reducing to undef, and two more to
reduce to the constant 1 and the constant 0. Do not reduce to undef if
the operand is a ConstantData, and do not reduce 0s to 1s.

Reducing GEP operands very frequently causes invalid IR (since types may
not match up if we index differently into a struct), so don't touch GEPs.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D111765
2021-10-19 15:25:21 -07:00
Fangrui Song 922bf57fc8 [Driver][Gnu] Delete unneeded -Bstatic dispatch for arm/thumb
Historically -static and -Bstatic are synonym.
gold made the semantics of -static slightly stronger but that does not matter.
2021-10-19 15:24:07 -07:00
Sanjay Patel 92a0389b04 [x86] add special-case lowering for usubsat for pre-SSE4
usubsat X, SMIN --> (X ^ SMIN) & (X s>> BW-1)

This would be a regression with D112085 where we combine to
usubsat more aggressively, so avoid that by matching the
special-case where we are subtracting SMIN (signmask):
https://alive2.llvm.org/ce/z/4_3gBD

Differential Revision: https://reviews.llvm.org/D112095
2021-10-19 17:13:16 -04:00
Keith Smiley 17386cb4dc [clang][Driver] Make multiarch output file basenames reproducible
When building a multiarch MachO binary, previously the intermediate
output file names would contain random characters. On macOS this
filename, since it's used when linking, ended up being used as a
stable-ish identifier for the adhoc codesignature of the binary, leading
to non-reproducible binaries. This change uses the architecture, when
available, to create a stable, but unique, basename for the file.

Differential Revision: https://reviews.llvm.org/D111269
2021-10-19 13:49:47 -07:00
Sanjay Patel e2faf721b2 [x86] add tests for psubus; NFC 2021-10-19 16:41:18 -04:00
Valentin Clement c983aeddcf
[fir] Add character utility functions in FIRBuilder
Extract part of D111337 in order to mke it smaller
and easier to review. This patch add some utility
functions to the FIRBuilder.

Add the following utility functions:
- getCharacterLengthType
- createStringLiteral
- locationToFilename
- characterWithDynamicLen
- sequenceWithNonConstantShape
- hasDynamicSize

These bring up the BoxValue implementation together with it.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: AlexisPerry

Differential Revision: https://reviews.llvm.org/D112074

Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
2021-10-19 22:34:21 +02:00
Volodymyr Sapsai 91e19f66e5 [driver] Explicitly specify `-fbuild-session-timestamp` in seconds.
Representation of the file's last modification time depends on the file
system and isn't guaranteed to be in seconds. Cast to seconds explicitly
and tighten the test case to check the magnitude of the calculated
value, so we can catch passing milliseconds or nanoseconds.

rdar://83915615

Differential Revision: https://reviews.llvm.org/D111205
2021-10-19 13:30:26 -07:00
Vedant Kumar 5e004b03f7 [lldb/test] Update test/API/functionalities/load_lazy to macOS 12
In macOS 12, dyld switched to using chained fixups. As a result, all symbols
are bound at launch and there are no lazy pointers any more. Since we wish to
import/dlopen() a dylib with missing symbols, we need to use a weak import.
This applies to all macOS 12-aligned OS releases, e.g. iOS 15, etc.

rdar://81295101

Differential Revision: https://reviews.llvm.org/D112034
2021-10-19 13:25:14 -07:00
Michael Liao 6fe902daf9 [cuda] Add address space predicate funuctions.
- Add the missing NVVM predicate builtins on address space checking
- Redefine them as pure functions so that they could be used in
  __builtin_assume.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D112053
2021-10-19 16:20:14 -04:00
Lawrence D'Anna 8ac5a6641f [lldb] improve the help strings for gdb-remote and kdp-remote
The help string can be more helpful by explaining these are
aliases for 'process connect'

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D111965
2021-10-19 13:08:21 -07:00
Bjorn Pettersson 9c44a0996c [SCEV] Fix formatting error introduced by D112080
Accidentally pushed D112080 without this clang-format cleanup.
2021-10-19 21:44:07 +02:00
Zequan Wu 57553ce432 Revert "Reland [clang] Pass -clear-ast-before-backend in Clang::ConstructJob()"
This reverts commit 1fb24fe85a.

This causes clang crash on chromium. See repro at https://bugs.chromium.org/p/chromium/issues/detail?id=1261551#c1.
2021-10-19 12:39:34 -07:00
Bjorn Pettersson 08619006a0 [SCEV] Avoid compile time explosion in ScalarEvolution::isImpliedCond
As seen in PR51869 the ScalarEvolution::isImpliedCond function might
end up spending lots of time when doing the isKnownPredicate checks.

Calling isKnownPredicate for example result in isKnownViaInduction
being called, which might result in isLoopBackedgeGuardedByCond being
called, and then we might get one or more new calls to isImpliedCond.
Even if the scenario described here isn't an infinite loop, using
some random generated C programs as input indicates that those
isKnownPredicate checks quite often returns true. On the other hand,
the third condition that needs to be fulfilled in order to "prove
implications via truncation", i.e. the isImpliedCondBalancedTypes
check, is rarely fulfilled.
I also made some similar experiments to look at how often we would
get the same result when using isKnownViaNonRecursiveReasoning instead
of isKnownPredicate. So far I haven't seen a single case when codegen
is negatively impacted by using isKnownViaNonRecursiveReasoning. On
the other hand, it seems like we get rid of the compile time explosion
seen in PR51869 that way. Hence this patch.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D112080
2021-10-19 21:37:57 +02:00
Philip Reames 0836a1059d Extend transform introduced in D111896 to multiple exits
This is trivial.  It was left out of the original review only because we had multiple copies of the same code in review at the same time, and keeping them in sync was easiest if the structure was kept in sync.
2021-10-19 12:12:19 -07:00
Philip Reames fca0218875 [indvars] Canonicalize exit conditions to unsigned using range info
This patch duplicates a bit of logic we apply to comparisons encountered during the IV users walk to conditions which feed exit conditions. Why? simplifyAndExtend has a very limited list of users it walks. In particular, in the examples is stops at the zext and never visits the icmp. (Because we can't fold the zext to an addrec yet in SCEV.) Being willing to visit when we haven't simplified regresses multiple tests (seemingly because of less optimal results when computing trip counts).

Note that this can be trivially extended to multiple exiting blocks. I'm leaving that to a future patch (solely to cut down on the number of versions of the same code in review at once.)

Differential Revision: https://reviews.llvm.org/D111896
2021-10-19 11:49:12 -07:00
Craig Topper dc8a5f9419 [RISCV] Use llvm::stable_sort instead of std::stable_sort. NFC 2021-10-19 11:37:40 -07:00
Anna Thomas 9403514e76 [LoopPredication] Calculate profitability without BPI
Using BPI within loop predication is non-trivial because BPI is only
preserved lossily in loop pass manager (one fix exposed by lossy
preservation is up for review at D111448). However, since loop
predication is only used in downstream pipelines, it is hard to keep BPI
from breaking for incomplete state with upstream changes in BPI.
Also, correctly preserving BPI for all loop passes is a non-trivial
undertaking (D110438 does this lossily), while the benefit of using it
in loop predication isn't clear.

In this patch, we rely on profile metadata to get almost similar benefit as
BPI, without actually using the complete heuristics provided by BPI.
This avoids the compile time explosion we tried to fix with D110438 and
also avoids fragile bugs because BPI can be lossy in loop passes
(D111448).

Reviewed-By: asbirlea, apilipenko
Differential Revision: https://reviews.llvm.org/D111668
2021-10-19 14:24:04 -04:00
Joe Loser 622c40722e
[libc++] Make __weekday_from_days private in weekday
`weekday` has a static member function `__weekday_from_days` which is
not part of the mandated public interface of `weeekday` according to the
standard. Since it is only used internally in the constructors of
`weekday`, let's make it private.

Reviewed By: ldionne, Mordante, #libc

Differential Revision: https://reviews.llvm.org/D112072
2021-10-19 14:21:33 -04:00
Joe Loser 494dad6b72
[libc++][NFC] Mark LWG3573 as complete
Mark LWG3573 as complete. It involves a change in wording around when
`basic_string_view`'s constructor for iterator/sentinel can throw. The
current implementation is not marked conditionally `noexcept`, so there
is nothing to do here. Add a test that binds this behavior to verify the
constructor is not marked `noexcept(true)` when `end - begin` throws.

Reviewed By: ldionne, Mordante, #libc

Differential Revision: https://reviews.llvm.org/D111925
2021-10-19 14:18:49 -04:00
Louis Dionne a039746e1c [runtimes] Trigger CI on changes to libunwind 2021-10-19 13:16:42 -04:00
Mehdi Amini e2f16be599 Fix clang-tidy warnings in MLIR Python bindings (NFC) 2021-10-19 17:15:20 +00:00
Sanjay Patel c1ca9e3077 [AMDGPU] add test for usubsat; NFC 2021-10-19 13:05:23 -04:00
Sanjay Patel 081bad1d4d [x86] add tests for psubus; NFC 2021-10-19 13:05:23 -04:00
Konstantin Varlamov b84da5ba6e [libc++] [test] Add tests for converting array types in shared_ptr.
The only possible kind of a conversion in initialization of a shared
pointer to an array is a qualification conversion (i.e., adding
cv-qualifiers). This patch adds tests for converting from `A[]` to
`const A[]` to the following functions:

```
template<class Y> explicit shared_ptr(Y* p);

template<class Y> shared_ptr(const shared_ptr<Y>& r);
template<class Y> shared_ptr(shared_ptr<Y>&& r);

template<class Y> shared_ptr& operator=(const shared_ptr<Y>& r);
template<class Y> shared_ptr& operator=(shared_ptr<Y>&& r);

template<class Y> void reset(Y* p);
template<class Y, class D> void reset(Y* p, D d);
template<class Y, class D, class A> void reset(Y* p, D d, A a);
```

Similar tests for converting functions that involve a `weak_ptr` should
be added once LWG issue [3001](https://cplusplus.github.io/LWG/issue3001)
is implemented.

Differential Revision: https://reviews.llvm.org/D112048
2021-10-19 13:03:51 -04:00
Jim Ingham a66798cd67 Remove unneeded variable num_found. 2021-10-19 09:57:07 -07:00
Jonas Devlieghere 1529738b66 [debugserver] Fix BUILDING_FOR_ARM64_OSX
Check for TARGET_CPU_ARM64 (ARM instructions for 64-bit mode) rather
than TARGET_CPU_ARM (instructions for 32-bit mode).
2021-10-19 09:55:53 -07:00
Arthur Eubanks ac0561ebb7 [Verifier] Add context for assume operand bundles verifier errors
And fix a typo.
2021-10-19 09:52:04 -07:00
Carlos Galvez 7812cb72a3 Use reference type in for loop
To fix failing build job.
2021-10-19 16:37:56 +00:00
Carlos Galvez bf6b0d1674 [clang-tidy] Support globbing in NOLINT* expressions
To simplify suppressing warnings (for example, for
when multiple check aliases are enabled).

The globbing format reuses the same code as for
globbing when enabling checks, so the semantics
and behavior is identical.

Differential Revision: https://reviews.llvm.org/D111208
2021-10-19 16:30:51 +00:00
Joseph Huber b1ce454930 [OpenMP] Remove macro guards for device debugging
The plugin currently uses a macro to check if this is a debug built
before assigning the debug kind variable to the device environment
struct. This is being deprecated because the new device runtime does not
maintain separate debug builds and should always be availible.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D112083
2021-10-19 12:21:43 -04:00
Louis Dionne 6fd55bba61 [libunwind] Add a from-scratch config for running libunwind tests
Running tests for libunwind is a lot simpler than running tests for
libc++, so a simple Lit config file is sufficient. The benefit is that
we disentangle the libunwind test configuration from the libc++ and
libc++abi test configuration. The setup was too complicated, which led
to some bugs (notably we were running against the system libunwind on
Apple platforms).

Differential Revision: https://reviews.llvm.org/D111664
2021-10-19 12:03:58 -04:00
Kazu Hirata cf68e1b2fb [Driver, Frontend] Use StringRef::contains (NFC) 2021-10-19 08:54:02 -07:00
Michał Górny b492b0be95 [lldb] [Process/Utility] Define dN regs on ARM via helper macro
Use FPU_REG macro to define dN registers, removing the wrong value_regs
while at it.  This is a piece-wise attempt of reconstructing D112066
with the goal of figuring out which part of the larger change breaks
the buildbot.

Differential Revision: https://reviews.llvm.org/D112066
2021-10-19 17:06:03 +02:00
Jamie Schmeiser 3af474c0a1 Changes to print-changed classes in preparation for DotCfg change printer
Summary:
Break out non-functional changes to the print-changed classes that are needed
for reuse with the DotCfg change printer in https://reviews.llvm.org/D87202.

Various changes to the change printers to facilitate reuse with the
upcoming DotCfg change printer. This includes changing several of
the classes and their support classes to being templates. Also,
some template parameter names were simplified to avoid confusion
with planned identifiers in the DotCfg change printer to come. A
virtual function in the class for comparing functions was changed
to a lambda. The virtual function same was replaced with calls to
operator==. The only intentional functional change was to add the exe name
as the first parameter to llvm::sys::ExecuteAndWait

Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: aeubanks (Arthur Eubanks)
Differential Revision: https://reviews.llvm.org/D110737
2021-10-19 10:58:40 -04:00
David Sherwood 5ea35791e6 [AArch64] Split out processor/tuning features
Following on from an earlier patch that introduced support for -mtune
for AArch64 backends, this patch splits out the tuning features
from the processor features. This gives us the ability to enable
architectural feature set A for a given processor with "-mcpu=A"
and define the set of tuning features B with "-mtune=B".

It's quite difficult to write a test that proves we select the
right features according to the tuning attribute because most
of these relate to scheduling. I have created a test here:

  CodeGen/AArch64/misched-fusion-addr-tune.ll

that demonstrates the different scheduling choices based upon
the tuning.

Differential Revision: https://reviews.llvm.org/D111551
2021-10-19 15:18:55 +01:00
David Sherwood 23db763b7d Fix documentation errors introduced by 607fb1bb8c 2021-10-19 15:12:03 +01:00
Shraiysh Vaishay 10e08784ca [MLIR][OpenMP][NFC] Moved Synchronization Hint related functions
The functions are moved above the parseClauses function as they
will be used inside it to parse `hint` clause

Reviewed By: clementval

Differential Revision: https://reviews.llvm.org/D112071
2021-10-19 19:38:31 +05:30
Amy Kwan 5eaf5b9161 [PowerPC] Restrict various P10 options to P10 only.
This patch attempts to restrict the following P10 options:
```
-mprefixed
-mpcrel
-mpaired-vector-memops
```
To P10 only. This will prevent the use of these options on P9 and earlier.

The behaviour of this patch looks like the following on pre-P10:
```
$ clang -mcpu=pwr9 -mpaired-vector-memops test.c -o test
error: option '-mpaired-vector-memops' cannot be specified without '-mcpu=pwr10'
$ clang -mcpu=pwr9 -mprefixed test.c -o test
error: option '-mprefixed' cannot be specified without '-mcpu=pwr10'
$ clang -mcpu=pwr9 -mprefixed -mpcrel test.c -o test
error: option '-mpcrel' cannot be specified without '-mcpu=pwr10 -mprefixed'
$ clang -mcpu=pwr9 -mpcrel -mprefixed test.c -o test
error: option '-mpcrel' cannot be specified without '-mcpu=pwr10 -mprefixed'
$ clang -mcpu=pwr9 -mpcrel test.c -o test
error: option '-mpcrel' cannot be specified without '-mcpu=pwr10 -mprefixed'
```

Differential Revision: https://reviews.llvm.org/D109652
2021-10-19 09:01:01 -05:00
David Sherwood 607fb1bb8c [AArch64] Always add -tune-cpu argument to -cc1 driver
This patch ensures that we always tune for a given CPU on AArch64
targets when the user specifies the "-mtune=xyz" flag. In the
AArch64Subtarget if the tune flag is unset we use the CPU value
instead.

I've updated the release notes here:

  llvm/docs/ReleaseNotes.rst

and added tests here:

  clang/test/Driver/aarch64-mtune.c

Differential Revision: https://reviews.llvm.org/D110258
2021-10-19 14:57:51 +01:00
Joe Loser ca889733a2
[libc++][docs] Mark LWG3420 complete
Mark LWG3420 as complete. Currently, the `cpp17_iterator` concept
checks that the type looks like an iterator first before checking if it
is copyable.

Reviewed By: ldionne, Quuxplusone, #libc

Differential Revision: https://reviews.llvm.org/D111598
2021-10-19 09:52:35 -04:00
Michał Górny 28e0c34216 [lldb] [Process/Utility] Define sN regs on ARM via helper macro
This is a piece-wise attempt of reconstructing D112066 with the goal
of figuring out which part of the larger change breaks the buildbot.

Differential Revision: https://reviews.llvm.org/D112066
2021-10-19 15:51:47 +02:00
Michał Górny 5cd28f71b1 [lldb] [Process/Utility] clang-format RegisterInfos_arm.h 2021-10-19 15:51:47 +02:00
Simon Pilgrim 71e39e3f18 [ADT] Add APInt::isNegatedPowerOf2() helper
Inspired by D111968, provide a isNegatedPowerOf2() wrapper instead of obfuscating code with (-Value).isPowerOf2() patterns, which I'm sure are likely avenues for typos.....

Differential Revision: https://reviews.llvm.org/D111998
2021-10-19 14:38:21 +01:00