Commit Graph

395763 Commits

Author SHA1 Message Date
Fangrui Song 43a43353f7 [gn build] (manually) port ee7d20e846 2021-08-04 10:28:27 -07:00
Jessica Paquette d9279843b1 [AArch64][GlobalISel] Widen G_PHI before clamping it during legalization
This allows us to handle weird types like s88; we first widen to s128, then
clamp back down to s64.

https://godbolt.org/z/9xqbP46Mz

Also this makes it possible for GISel to legalize the case in pr48188.ll. It
now does the same thing as SDAG, although regalloc chooses different registers.

Differential Revision: https://reviews.llvm.org/D107417
2021-08-04 10:25:14 -07:00
Jessica Paquette 7d97de60b3 [AArch64][GlobalISel] Widen G_FPTO*I before clamping
Going through our legalization rules and doing some cleanup.

Widening and then clamping is usually easier than clamping and then widening.

This allows us to legalize some weird types like s88.

Differential Revision: https://reviews.llvm.org/D107413
2021-08-04 10:19:26 -07:00
Petr Hosek 6660cec568 [InstrProfiling] Emit bias variable eagerly
Rather than emitting the bias variable lazily as needed, emit it
eagerly. This allows profile runtime to refer to this variable
unconditionally without having to use the weak reference. The bias
variable is in a COMDAT so there'll never be more than one instance,
and if it's not needed, linker should be able to GC it, so the overhead
should be minimal.

Differential Revision: https://reviews.llvm.org/D107377
2021-08-04 10:17:08 -07:00
Geoffrey Martin-Noble b0d58ddf87 [Bazel] Update build for ee7d20e846
Updates the Bazel configuration for
https://github.com/llvm/llvm-project/commit/ee7d20e84675. We need to
drop the dependency from llvm-tblgen to avoid a dependency cycle:

```
.-> @llvm-project//llvm:llvm-tblgen
|   @llvm-project//llvm:tblgen
|   @llvm-project//llvm:MC
|   @llvm-project//llvm:ProfileData
|   @llvm-project//llvm:Core
|   @llvm-project//llvm:attributes_gen
|   @llvm-project//llvm:include/llvm/IR/Attributes.inc
|   @llvm-project//llvm:attributes_gen__gen_attrs_genrule
`-- @llvm-project//llvm:llvm-tblgen
```

It appears this dep was not strictly necessary though. TableGen uses MC
headers but it can get those through Support, which also exports MC
headers due to layering issues.

Differential Revision: https://reviews.llvm.org/D107480
2021-08-04 10:08:57 -07:00
Andrea Di Biagio 7a1a35a1d1 [X86][SchedModel] Add missing ReadAdvance for some arithmetic ops (PR51318 and PR51322).
This fixes a bug where implicit uses of EFLAGS were not marked as ReadAdvance in
the RM/MR variants of ADC/SBB (PR51318)

This also fixes the absence of ReadAdvance for the register operand of
RMW arithmetic instructions (PR51322).

Differential Revision: https://reviews.llvm.org/D107367
2021-08-04 17:50:22 +01:00
Shilei Tian 680c71b127 [OpenMP] Clean up for hidden helper task
This patch makes some clean up for code of hidden helper task.

Reviewed By: protze.joachim

Differential Revision: https://reviews.llvm.org/D107008
2021-08-04 12:36:44 -04:00
Shilei Tian 9f5d6ea52e [OpenMP] Fix performance regression reported in bug #51235
This patch fixes the "performance regression" reported in https://bugs.llvm.org/show_bug.cgi?id=51235. In fact it has nothing to do with performance. The root cause is, the stolen task is not allowed to execute by another thread because by default it is tied task. Since hidden helper task will always be executed by hidden helper threads, it should be untied.

Reviewed By: protze.joachim

Differential Revision: https://reviews.llvm.org/D107121
2021-08-04 12:34:49 -04:00
Fangrui Song 0a6aad5991 [ELF] Fix typo. NFC 2021-08-04 09:26:29 -07:00
jamesluox ee7d20e846 [CSSPGO] Migrate and refactor the decoder of Pseudo Probe
Migrate pseudo probe decoding logic in llvm-profgen to MC, so other LLVM-base program could reuse existing codes. Redesign object layout of encoded and decoded pseudo probes.

Reviewed By: hoy

Differential Revision: https://reviews.llvm.org/D106861
2021-08-04 09:21:34 -07:00
Sander de Smalen fe6ae81ef3 [InstCombine] Fix vscale zext/sext optimization when vscale_range is unbounded.
According to the LangRef, a (vscale_range) value of 0 means unbounded.

This patch additionally cleans up the test file vscale_sext_and_zext.ll.
2021-08-04 17:17:37 +01:00
Bradley Smith e57e1e4e00 [clang][AArch64][SVE] Avoid going through memory for fixed/scalable predicate casts
For fixed SVE types, predicates are represented using vectors of i8,
where as for scalable types they are represented using vectors of i1. We
can avoid going through memory for casts between these by bitcasting the
i1 scalable vectors to/from a scalable i8 vector of matching size, which
can then use the existing vector insert/extract logic.

Differential Revision: https://reviews.llvm.org/D106860
2021-08-04 16:10:37 +00:00
Fangrui Song 66d4430492 [ELF] Combine foo@v1 and foo with the same versionId if both are defined
Due to an assembler design flaw (IMO), `.symver foo,foo@v1` produces two symbols `foo` and `foo@v1` if `foo` is defined.

* `v1 {};` produces both `foo` and `foo@v1`, but GNU ld only produces `foo@v1`
* `v1 { foo; };` produces both `foo@@v1` and `foo@v1`, but GNU ld only produces `foo@v1`
* `v2 { foo; };` produces both `foo@@v2` and `foo@v1`, matching GNU ld. (Tested by symver.s)

This patch implements the GNU ld behavior by reusing the symbol redirection mechanism
in D92259. The new test symver-non-default.s checks the first two cases.

Without the patch, the second case will produce `foo@v1` and `foo@@v1` which
looks weird and makes foo unnecessarily default versioned.

Note: `.symver foo,foo@v1,remove` exists but the unfortunate `foo` will not go
away anytime soon.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D107235
2021-08-04 09:06:05 -07:00
Dmitry Vyukov bdeb15c34e tsan: remove non-existent MemoryAccessRangeStep
Probably was used for Go at some point...

Depends on D107466.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D107467
2021-08-04 18:04:06 +02:00
Dmitry Vyukov c2598be8bc tsan: move AccessType to tsan_defs.h
It will be needed in more functions like ReportRace
(the plan is to pass it through MemoryAccess to ReportRace)
and this move will allow to split the huge tsan_rtl.h into parts
(e.g. move FastState/Shadow definitions to a separate header).

Depends on D107465.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D107466
2021-08-04 18:03:56 +02:00
Dmitry Vyukov 2ddaffdc74 tsan: introduce kAccessExternalPC
Add kAccessExternal memory access flag that denotes
memory accesses with PCs that may have kExternalPCBit set.
In preparation for MemoryAccess refactoring.
Currently unused, but will allow to skip a branch.

Depends on D107464.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D107465
2021-08-04 18:03:49 +02:00
Dmitry Vyukov d41233e9cf tsan: introduce kAccessFree
Add kAccessFree memory access flag (similar to kAccessVptr).
In preparation for MemoryAccess refactoring.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D107464
2021-08-04 18:03:41 +02:00
Fangrui Song 7ed22a6fa9 [ELF] Apply version script patterns to non-default version symbols
Currently version script patterns are ignored for .symver produced
non-default version (single @) symbols. This makes such symbols
not localizable by `local:`, e.g.

```
.symver foo3_v1,foo3@v1
.globl foo_v1
foo3_v1:

ld.lld --version-script=a.ver -shared a.o
# In a.out, foo3@v1 is incorrectly exported.
```

This patch adds the support:

* Move `config->versionDefinitions[VER_NDX_LOCAL].patterns` to `config->versionDefinitions[versionId].localPatterns`
* Rename `config->versionDefinitions[versionId].patterns` to `config->versionDefinitions[versionId].nonLocalPatterns`
* Allow `findAllByVersion` to find non-default version symbols when `includeNonDefault` is true. (Note: `symtab` keys do not have `@@`)
* Make each pattern check both the unversioned `pat.name` and the versioned `${pat.name}@${v.name}`
* `localPatterns` can localize `${pat.name}@${v.name}`. `nonLocalPatterns` can prevent localization by assigning `verdefIndex` (before `parseSymbolVersion`).

---

If a user notices new `undefined symbol` errors with a version script containing
`local: *;`, the issue is likely due to a missing `global:` pattern.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D107234
2021-08-04 09:02:11 -07:00
Lechen Yu 3bc8ce5dd7 [openmp] Add OMPT initialization in libomptarget
When loading libomptarget, the init function in libomptarget/src/rtl.cpp
will search for the libomptarget_start_tool function using libdl.
libomptarget_start_tool will pass those OMPT callbacks related to target
constructs to libomptarget

Differential Revision: https://reviews.llvm.org/D99803
2021-08-04 18:00:11 +02:00
Fangrui Song 9bd29a73d1 [ELF] Make dot in .tbss correct
GNU ld doesn't support multiple SHF_TLS SHT_NOBITS output sections (it restores
the address after an SHF_TLS SHT_NOBITS section, so consecutive SHF_TLS
SHT_NOBITS sections will have conflicting address ranges).

That said, `threadBssOffset` implements limited support for consecutive SHF_TLS
SHT_NOBITS sections. (SHF_TLS SHT_PROGBITS following a SHF_TLS SHT_NOBITS can still be
incorrect.)

`.` in an output section description of an SHF_TLS SHT_NOBITS section is
incorrect. (https://lists.llvm.org/pipermail/llvm-dev/2021-July/151974.html)

This patch saves the end address of the previous tbss section in
`ctx->tbssAddr`, changes `dot` in the beginning of `assignOffset` so
that `.` evaluation will be correct.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D107208
2021-08-04 08:58:50 -07:00
Aart Bik b4a1eab941 [mlir][sparse] fixed typo in sparse tensor type attribute alias
Reviewed By: grosul1, rriddle

Differential Revision: https://reviews.llvm.org/D107472
2021-08-04 08:57:52 -07:00
Bradley Smith d9cc5d84e4 [AArch64][SVE] Combine bitcasts of predicate types with vector inserts/extracts of loads/stores
An insert subvector that is inserting the result of a vector predicate
sized load into undef at index 0, whose result is casted to a predicate
type, can be combined into a direct predicate load. Likewise the same
applies to extract subvector but in reverse.

The purpose of this optimization is to clean up cases that will be
introduced in a later patch where casts to/from predicate types from i8
types will use insert subvector, rather than going through memory early.

This optimization is done in SVEIntrinsicOpts rather than InstCombine to
re-introduce scalable loads as late as possible, to give other
optimizations the best chance possible to do a good job.

Differential Revision: https://reviews.llvm.org/D106549
2021-08-04 15:51:14 +00:00
Aart Bik 478c71bf95 [mlir][amx] add doc to AMX dialect
Making sure the AMX dialect webpage reads better with a short introduction on the purpose of this dialect.

Reviewed By: grosul1, bondhugula

Differential Revision: https://reviews.llvm.org/D107419
2021-08-04 08:46:30 -07:00
Pushpinder Singh f3eb5f900d [AMDGPU][OpenMP] Wrap amdgcn declare variant inside ifdef
This fixes the issue https://bugs.llvm.org/show_bug.cgi?id=51337

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D107468
2021-08-04 15:24:46 +00:00
Simon Wallis 9269752671 [AArch64] Fix assert AArch64TargetLowering::ReplaceNodeResults
Don't know how to custom expand this
UNREACHABLE executed at llvm-project/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16788

The fix is to provide missing expansions for:
  case ISD::STRICT_FP_TO_UINT:
  case ISD::STRICT_FP_TO_SINT:

A test case is provided.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D107452
2021-08-04 16:18:19 +01:00
Sean Fertile b8f612e780 [PowerPC][AIX] Packed zero-width bitfields do not affect alignment.
Zero-width bitfields on AIX pad out to the natral alignment boundary but
do not change the containing records alignment.

Differential Revision: https://reviews.llvm.org/D106900
2021-08-04 11:03:25 -04:00
Jay Foad ba5c4ac600 [AMDGPU] Add cttz tests and globalisel checks for ctlz 2021-08-04 15:57:14 +01:00
Chris Jackson 21ee38e24f [DebugInfo][LSR] Avoid crashes on large integer inputs
SCEV-based salvaging in LSR translates SCEVs to DIExpressions. SCEVs may
contain very large integers but the translation does not support
integers greater than 64 bits. This patch adds checks to ensure
conversions of these large integers is not attempted. A regression test
is added to ensure no such translation is attempted.

Reviewed by: StephenTozer

PR: https://bugs.llvm.org/show_bug.cgi?id=51329

Differential Revision: https://reviews.llvm.org/D107438
2021-08-04 15:51:22 +01:00
Jay Foad 027d3b747e [AMDGPU] Generate checks for i64 to fp conversions
Differential Revision: https://reviews.llvm.org/D107429
2021-08-04 15:39:46 +01:00
Kazu Hirata 785f37b207 [ADT] Drop unnecessary const from return types (NFC)
Identified with const-return-type-APInt.
2021-08-04 07:38:24 -07:00
Reshabh Sharma d42e70b3d3 [AMDGPU] Handle functions in llvm's global ctors and dtors list
This patch introduces a new code object metadata field, ".kind"
which is used to add support for init and fini kernels.

HSAStreamer will use function attributes, "device-init" and
"device-fini" to distinguish between init and fini kernels from
the regular kernels and will emit metadata with ".kind" set to
"init" and "fini" respectively.

To reduce the number of init and fini kernels, the ctors and
dtors present in the llvm's global.ctors and global.dtors lists
are called from a single init and fini kernel respectively.

Reviewed by: yaxunl

Differential Revision: https://reviews.llvm.org/D105682
2021-08-04 19:53:33 +05:30
Roman Lebedev 35c0848b57
[NFC][X86] combineX86ShuffleChain(): hoist Mask variable higher up
Having `NewMask` outside of an if and rebinding `BaseMask` `ArrayRef`
to it is confusing. Instead, just move the `Mask` vector higher up,
and change the code that earlier had no access to it but now does
to use `Mask` instead of `BaseMask`.

This has no other intentional changes.
2021-08-04 17:15:12 +03:00
Roman Lebedev 916cdc3d4b
[NFC][X86] combineX86ShuffleChain(): rename inner Mask to avoid future shadowing
I want to hoist `Mask` variable higher up,
but then it would clash with this one.
So let's rename this one first.

There are no other intentional changes here other than said rename.
2021-08-04 17:15:12 +03:00
Tomas Matheson 40650f27b5 [ARM][atomicrmw] Fix CMP_SWAP_32 expand assert
This assert is intended to ensure that the high registers are not
selected when it is passed to one of the thumb UXT instructions. However
it was triggering even for 32 bit where no UXT instruction is emitted.

Fixes PR51313.

Differential Revision: https://reviews.llvm.org/D107363
2021-08-04 15:02:02 +01:00
Roman Lebedev f819e4c7d0
[X86] combineX86ShuffleChain(): canonicalize mask elts picking from splats
Given a shuffle mask, if it is picking from an input that is splat
given the current granularity of the shuffle, then adjust the mask
to pick from the same lane of the input as the mask element is in.
This may result in a shuffle being simplified into a blend.

I believe this is correct given that the splat detection matches the one
just above the new code,

My basic thought is that we might be able to get less regressions
by handling multiple insertions of the same value into a vector
if we form broadcasts+blend here, as opposed to D105390,
but i have not really thought this through,
and did not try implementing it yet.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D107009
2021-08-04 16:55:04 +03:00
Matthias Springer 438f700b4d [mlir] Fix gcc-5 build in ViewOpGraph.cpp
Differential Revision: https://reviews.llvm.org/D107458
2021-08-04 22:49:34 +09:00
Matthias Springer b44eb5a149 [flang] Add missing FileSystem.h
This file was previously included transitively via `mlir/Transforms/Passes.h`, but the include has been removed from that file.

Differential Revision: https://reviews.llvm.org/D107455
2021-08-04 22:22:26 +09:00
David Green eeddcba525 [RDA] Attempt to make RDA subreg aware
This attempts to make more of RDA aware of potentially overlapping
subregisters. Some of this was already in place, with it iterating
through MCRegUnitIterators. This also replaces calls to
LiveRegs.contains(..) with !LiveRegs.available(..), and updates the
isValidRegUseOf and isValidRegDefOf to search subregs.

Differential Revision: https://reviews.llvm.org/D107351
2021-08-04 14:21:32 +01:00
David Green ff9958b70e [ARM] Test showing incorrect codegen when subreg liveness is enabled. NFC 2021-08-04 14:21:32 +01:00
Matthias Springer b6408fa169 [mlir] Include llvm/Support/Debug.h in Transforms/Passes.h
There are many downstream users of llvm::dbgs, which is defined in Debug.h. Before D106342, many users included that dependency transitively via the now deleted ViewRegionGraph.h. Adding it back to Transforms/Passes.h for convenience.

Differential Revision: https://reviews.llvm.org/D107451
2021-08-04 21:45:28 +09:00
Simon Pilgrim 8cd40ece70 [X86] Rename X86 tuning feature flag FeatureHasFastGather -> FeatureFastGather
Match the naming style used by the other 'FeatureFast/FeatureSlow' tuning flags.
2021-08-04 13:07:50 +01:00
Simon Pilgrim 17e8ac0703 [X86] Move FeatureFastBEXTR from bdver2 features to tuning
Noticed while looking at the feature flag renaming suggested in D107370
2021-08-04 13:07:49 +01:00
Muhammad Omair Javaid f2128abec2 [LLDB] Skip flaky tests on Arm/AArch64 Linux bots
Following LLDB tests fail randomly on LLDB Arm/AArch64 Linux buildbots.
We still not have a reliable solution for these tests to pass
consistently. I am marking them skipped for now.

TestBreakpointCallbackCommandSource.py
TestIOHandlerResize.py
TestEditline.py
TestGuiViewLarge.py
TestGuiExpandThreadsTree.py
TestGuiBreakpoints.py
2021-08-04 16:57:36 +05:00
Dmitry Vyukov e3f4c63e78 tsan: don't use spinning in __cxa_guard_acquire/pthread_once
Currently we use passive spinning with internal_sched_yield to wait
in __cxa_guard_acquire/pthread_once. Passive spinning tends to degrade
ungracefully under high load. Use FutexWait/Wake instead.

Depends on D107359.

Reviewed By: vitalybuka, melver

Differential Revision: https://reviews.llvm.org/D107360
2021-08-04 13:56:33 +02:00
Jan Svoboda 2718ae397b [clang][deps] Substitute clang-scan-deps executable in lit tests
The lit tests for `clang-scan-deps` invoke the tool without going through the substitution system. While the test runner correctly picks up the `clang-scan-deps` binary from the build directory, it doesn't print its absolute path. When copying the invocations when reproducing test failures, this can result in `command not found: clang-scan-deps` errors or worse yet: pick up the system `clang-scan-deps`. This patch adds new local `%clang-scan-deps` substitution.

Reviewed By: lxfind, dblaikie

Differential Revision: https://reviews.llvm.org/D107155
2021-08-04 13:55:14 +02:00
Dmitry Vyukov 0bc626d516 tsan: refactor guard_acquire/release
Introduce named consts for magic values we use.

Differential Revision: https://reviews.llvm.org/D107445
2021-08-04 13:52:27 +02:00
Jan Svoboda 0556138624 [clang][cli] Expose -fno-cxx-modules in cc1
For some use-cases, it might be useful to be able to turn off modules for C++ in `-cc1`. (The feature is implied by `-std=C++20`.)

This patch exposes the `-fno-cxx-modules` option in `-cc1`.

Reviewed By: arphaman

Differential Revision: https://reviews.llvm.org/D106864
2021-08-04 13:46:40 +02:00
Matthias Springer 9102a16bef [mlir] Support drawing control-flow graphs in ViewOpGraph.cpp
* Add new pass option `print-data-flow-edges`, default value `true`.
* Add new pass option `print-control-flow-edges`, default value `false`.
* Remove `PrintCFGPass`. Same functionality now provided by
  `PrintOpPass`.

Differential Revision: https://reviews.llvm.org/D106342
2021-08-04 20:45:15 +09:00
Dmitry Vyukov 636428c727 tsan: unify __cxa_guard_acquire and pthread_once implementations
Currently we effectively duplicate "once" logic for __cxa_guard_acquire
and pthread_once. Unify the implementations.

This is not a no-op change:
 - constants used for pthread_once are changed to match __cxa_guard_acquire
   (__cxa_guard_acquire constants are tied to ABI, but it does not seem
   to be the case for pthread_once)
 - pthread_once now also uses PotentiallyBlockingRegion annotations
 - __cxa_guard_acquire checks thr->in_ignored_lib to skip user synchronization
It's unclear if these 2 differences are intentional or a mere sloppy inconsistency.
Since all tests still pass, let's assume the latter.

Reviewed By: vitalybuka, melver

Differential Revision: https://reviews.llvm.org/D107359
2021-08-04 13:44:19 +02:00
Dmitry Vyukov 14e306fa4b tsan: use DCHECK instead of CHECK in atomic functions
Atomic functions are semi-hot in profiles.
The CHECKs verify values passed by compiler
and they never fired, so replace them with DCHECKs.

Reviewed By: vitalybuka, melver

Differential Revision: https://reviews.llvm.org/D107373
2021-08-04 13:23:57 +02:00