Commit Graph

388586 Commits

Author SHA1 Message Date
Heejin Ahn 8e35a18e4a [WebAssembly] Support Emscripten EH/SjLj in Wasm64
In wasm64, the signatures of some library functions and global variables
defined in Emscripten change:
- `emscripten_longjmp`: `(i32, i32) -> ()` -> `(i64, i32) -> ()`
  This changes because the first argument is the address of a memory
  buffer. This in turn causes more changes below.
- `setThrew`: `(i32, i32) -> ()` -> `(i64, i32) -> ()`
  `emscripten_longjmp` calls `setThrew` with the i64 buffer argument as
  the first parameter.
- `__THREW__` (global var): `i32` to `i64`
  `setThrew`'s first argument is set to this `__THREW__` variable, so it
  should change to i64 as well.
- `testSetjmp`: `(i32, i32*, i32) -> (i32)` -> `(i64, i32*, i32) -> (i32)`
  In the code transformation done in this pass, the value of `__THREW__`
  is passed as the first parameter of `testSetjmp`.

This patch creates some helper functions to easily get types that become
different depending on the wasm32/wasm64, and uses them to change
various function signatures and code transformations. Also updates the
tests with WASM32/WASM64 check lines.

(Untested) Emscripten side patch: https://github.com/emscripten-core/emscripten/pull/14108

Reviewed By: aardappel

Differential Revision: https://reviews.llvm.org/D101985
2021-05-14 03:45:09 -07:00
Tim Northover ea0eec69f1 IR+AArch64: add a "swiftasync" argument attribute.
This extends any frame record created in the function to include that
parameter, passed in X22.

The new record looks like [X22, FP, LR] in memory, and FP is stored with 0b0001
in bits 63:60 (CodeGen assumes they are 0b0000 in normal operation). The effect
of this is that tools walking the stack should expect to see one of three
values there:

  * 0b0000 => a normal, non-extended record with just [FP, LR]
  * 0b0001 => the extended record [X22, FP, LR]
  * 0b1111 => kernel space, and a non-extended record.

All other values are currently reserved.

If compiling for arm64e this context pointer is address-discriminated with the
discriminator 0xc31a and the DB (process-specific) key.

There is also an "i8** @llvm.swift.async.context.addr()" intrinsic providing
front-ends access to this slot (and forcing its creation initialized to nullptr
if necessary).
2021-05-14 11:43:58 +01:00
Simon Pilgrim 079bbea2b2 [Local] collectBitParts - for bswap-only matches, limit shift amounts to whole bytes to reduce compile time. 2021-05-14 11:42:52 +01:00
Simon Pilgrim 78c8451cd7 [Local] collectBitParts - reduce maximum recursion depth.
As noticed on D90170, the recursion depth for matching a maximum of a i128 bitwidth was too high.

@lebedev.ri mentioned that we can probably do better by limiting the number of collected Values instead of just depth, but I'll look at that later.
2021-05-14 11:42:51 +01:00
Florian Hahn 7ba0e99aec
[VectorCombine] Add tests with assumes involvind variable index.
Add test cases with variable indices together with assumes guaranteeing
that the indices are valid.
2021-05-14 11:20:08 +01:00
Anton Afanasyev 207cdd7ed9 [SLP] Fix spill cost computation for insertelement tree node
This is follow up for D98714, bugfixing.
2021-05-14 13:14:41 +03:00
Simon Pilgrim 5ed56a821c [X86] Try to pass DebugLoc by const-ref to avoid costly TrackingMDNodeRef copies. NFCI. 2021-05-14 11:14:18 +01:00
Max Kazantsev e51ef7f070 [Test] Add test on missing opportunity in Loop Deletion
We can break the backedge in some cases when we can evaluate some of the
values and conditions on the 1st iteration.
2021-05-14 16:57:50 +07:00
Tim Northover 4789fc75d3 AArch64: support i128 cmpxchg in GlobalISel.
There are three essentially different cases to handle:

  * -O1, no LSE. The IR is expanded to ldxp/stxp and we need patterns to select
    them.
  * -O0, no LSE. We get G_ATOMIC_CMPXCHG, and need to produce CMP_SWAP_N
    pseudos. The registers are all 64-bit so this is easy.
  * LSE. We get G_ATOMIC_CMPXCHG and need to produce a CASP instruction with
    XSeqPair registers.

The last case is by far the hardest, and and adds 128-bit GPR support as a
byproduct.
2021-05-14 10:41:38 +01:00
Sander de Smalen 459c48e04f NFCI: Remove VF argument from isScalarWithPredication
As discussed in D102437, the VF argument to isScalarWithPredication
seems redundant, so this is intended to be a non-functional change. It
seems wrong to query the widening decision at this point. Removing the
operand and code to get the widening decision causes no unit/regression
tests to fail. I've also found no issues running the LLVM test-suite.

This subsequently removes the VF argument from isPredicatedInst as well,
since it is no longer required.
2021-05-14 10:34:40 +01:00
Jay Foad 7f81c5a5ba [AMDGPU] getMemOperandsWithOffset: add vaddr operand for stack access BUF instructions
A consequence is that checkInstOffsetsDoNotOverlap can now distinguish
sp+offset from fp+offset, so it knows that it shouldn't try to work out
whether the accesses overlap just by comparing the offsets. For example
in these two instructions:

MIR:
BUFFER_STORE_DWORD_OFFSET %0:vgpr_32(s32), $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 4, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 4 into stack + 4, addrspace 5)
%4:vgpr_32 = BUFFER_LOAD_DWORD_OFFEN %stack.0.alloca, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 0, 0, 0, 0, 0, 0, implicit $exec :: (load 4 from `i8 addrspace(5)* undef`, addrspace 5)

ISA:
buffer_store_dword v0, off, s[0:3], s32 offset:4
buffer_load_dword v0, off, s[0:3], s34

Differential Revision: https://reviews.llvm.org/D73957
2021-05-14 10:10:43 +01:00
Alexandros Lamprineas 1079870971 [llvm-mc][AArch64] HINT instruction disassembled as BTI
The Arm Architecture Reference Manual says that the SystemHintOp_BTI
opcode is prefered when CRm:op2 matches 0100:xx0, but llvm-mc
currently accepts 0100:xxx, which isn't right.

Differential Revision: https://reviews.llvm.org/D102415
2021-05-14 10:05:37 +01:00
Martin Storsjö c12c8124e1 [libcxx] [test] Change the generic_string_alloc test to test conversions to all char types
On windows, the native path char type is wchar_t - therefore, this test
didn't actually do the conversion that the test was supposed to exercise.

The charset conversions on windows do cause extra allocations outside of
the provided allocator though, so that bit of the test has to be waived
now that the test actually does something. (Other tests have similar
TEST_NOT_WIN32() for allocation checks for charset conversions.)

Also fix a typo, and amend the path.native.obs/string_alloc test to
test char8_t, too.

Differential Revision: https://reviews.llvm.org/D102360
2021-05-14 11:56:48 +03:00
Roman Lebedev 43a7f130a7
[X86] AMD Zen 3: same-reg AVX YMM VXORPD is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 11:56:07 +03:00
Roman Lebedev 336b9dbe88
[X86] AMD Zen 3: same-reg AVX XMM VXORPD is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 11:56:07 +03:00
Roman Lebedev 9c596bc541
[X86] AMD Zen 3: same-reg SSE XMM XORPD is a 1-cycle(!) dep-breaking zero-idiom
Same as with it's float friend, unlike their AVX versions.
As confirmed by exegesis, and ref docs.
2021-05-14 11:56:07 +03:00
Roman Lebedev 3567c7eda1
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VXORPD tests 2021-05-14 11:56:07 +03:00
Roman Lebedev 57eee56d0a
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VXORPD tests 2021-05-14 11:56:06 +03:00
Roman Lebedev fdc65e46b6
[NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM XORPD tests 2021-05-14 11:56:06 +03:00
Roman Lebedev 59554c01ab
[X86] AMD Zen 3: same-reg AVX YMM VXORPS is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis, and ref docs.
2021-05-14 11:56:06 +03:00
Roman Lebedev 2a7c52ff7f
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VXORPS tests 2021-05-14 11:56:06 +03:00
Roman Lebedev 26c1bffe67
[X86] AMD Zen 3: same-reg AVX XMM VXORPS is a zero-cycle(!) dep-breaking zero-idiom
Unlike it's legacy SSE XMM XORPS version, which measures as being 1-cycle,
this one is certainly a zero-cycle instruction, in addition to both of them
being dependency breaking.

As confirmed by exegesis measurements, and ref docs.
2021-05-14 11:56:06 +03:00
Roman Lebedev a9fb321a67
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VXORPS tests 2021-05-14 11:56:06 +03:00
Pooja Yadav 4763c8c9e3 [docs] Added llvm/cmake section
Added information about the cmake inside llvm.

Reviewed By: xgupta, jroelofs

Differential Revision: https://reviews.llvm.org/D101925
2021-05-14 14:10:56 +05:30
David Stuttard 31b62aa162 [AMDGPU] Fix codegen of image intrinsics for g16 and a16
For gfx10 gradient (g16) and address (a16) can be independent. Previous
implementation assumed that a16 implied g16.

There are some other changes that fix the verification (as well as asm/disasm)
that are required for the included test to pass - the XFAIL will be removed in
those changes.

This also includes required fixes for GlobalISel

Differential Revision: https://reviews.llvm.org/D102066

Change-Id: I7d171cc90994de05f41669b66a6d0ffa2ed05d09
2021-05-14 09:28:15 +01:00
David Stuttard 72d570ca08 [AMDGPU][AsmParser/Disassembler] Correct A16 and G16 handling
A16 support for image instructions assembly/disassembly (gfx10) was missing

Also refactor MIMG op addr size calcs to common function

We'd got 3 places where the same operation was being done.

One test is now marked XFAIL until a related codegen patch is in place

Differential Revision: https://reviews.llvm.org/D102231

Change-Id: I7e86e730ef8c71901457855cba570581f4f576bb
2021-05-14 09:25:44 +01:00
David Spickett 2db090a2eb [llvm][AsmPrinter] Restore source location to register clobber warning
Since 5de2d189e6 this particular warning
hasn't had the location of the source file containing the inline
assembly.

Fix this by reporting via LLVMContext. Which means that we no longer
have the "instantiated into assembly here" lines but they were going to
point to the start of the inline asm string anyway.

This message is already tested via IR in llvm. However we won't have
the required location info there so I've added a C file test in clang
to cover it.
(though strictly, this is testing llvm code)

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D102244
2021-05-14 08:22:57 +00:00
Alexey Bader 444f02d73c New tag for ittapi - fix an error related to cross-compiling ITTAPI in LLVM with mingw
Fix was implemented in the ittap repo to solve an error about cross-compiling ITTAPI in LLVM with mingw.
The problem occurred in the cross-compilation environment for Julia's dependencies.
The corresponding issue item in ittapi repo: https://github.com/intel/ittapi/issues/19
A new tag was created in ittapi repo for that fix.

This patch contains changes to update the ittapi tag in LLVM.

Reviewed By: bader

Differential Revision: https://reviews.llvm.org/D102471
2021-05-14 08:18:49 +03:00
dfukalov fdae3fc8b3 [GVN] Clobber partially aliased loads.
Use offsets stored in `AliasResult` implemented in D98718.

Updated with fix of issue reported in https://reviews.llvm.org/D95543#2745161

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D95543
2021-05-14 11:17:14 +03:00
David Green f7cb654763 [DSE] Move isOverwrite into DSEState. NFC
This moves the isOverwrite function into the DSEState so that it can
share the analyses and members from the state.

A few extra loop tests were also added to test stores in and around
multi block loops for D100464.
2021-05-14 09:16:51 +01:00
Lang Hames c82a0ae70e [ORC] Add JITLink dependence for ObjectLinkingLayerTest.
This aims to fix the failure at
https://lab.llvm.org/buildbot/#/builders/61/builds/9590.
2021-05-13 22:48:30 -07:00
LLVM GN Syncbot de115c3fb2 [gn build] Port 0fda4c4745 2021-05-14 04:56:03 +00:00
Lang Hames 0fda4c4745 [ORC] Add support for adding LinkGraphs directly to ObjectLinkingLayer.
This is separate from (but builds on) the support added in ec6b71df70 for
emitting LinkGraphs in the context of an active materialization. This commit
makes LinkGraphs a first-class data structure with features equivalent to
object files within ObjectLinkingLayer.
2021-05-13 21:44:13 -07:00
Lang Hames 9099c9ef78 [JITLink] Fix missing 'static' keyword in unit test. 2021-05-13 21:44:13 -07:00
Fangrui Song 261d6e05d5 [sanitizer] Simplify __sanitizer::BufferedStackTrace::UnwindImpl implementations
Intended to be NFC. D102046 relies on the refactoring for stack boundaries.
2021-05-13 21:26:31 -07:00
Carl Ritson 9cf6ff7aff [AMDGPU] Do not clause NSA instructions
To ensure correct behaviour NSA instructions should not be claused.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D102211
2021-05-14 12:54:56 +09:00
Reid Kleckner d2f4b7d778 Use enum comparison instead of generated switch/case, NFC
Clang's coverage data for auto-generated switch cases is really, really
large. Before this change, when I enable code coverage, SemaDeclAttr.obj
is 4.0GB. Naturally, this fails to link.

Replacing the RISCV builtin id check with a comparison reduces object
file size from 4.0GB to 330MB. Replacing the AArch64 SVE range check
reduces the size again down to 17MB, which is reasonable.

I think the RISCV switch is larger in coverage data because it uses more
levels of macro expansion, while the SVE intrinsics only use one. In any
case, please try to avoid switches with 1000+ cases, they usually don't
optimize well.
2021-05-13 20:26:50 -07:00
Reid Kleckner ee23f8b36f [COFF] Remove a truncation assertion from setRVA
LLD already produces a nice error message when sections exceed 4GB, and
this setRVA assertion causes LLD to crash instead of diagnosing the
error properly.

No test because we don't want slow tests that create 4GB files.
2021-05-13 19:37:14 -07:00
Matthias Springer a088bed4e3 [mlir] VectorToSCF cleanup
Group functions/structs in namespaces for better code readability.

Depends On D102123

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D102124
2021-05-14 11:04:37 +09:00
Rahul Joshi 23a84e1c60 [MLIR] Fix build failures due to unused variables in non-debug builds.
Differential Revision: https://reviews.llvm.org/D102458
2021-05-13 18:42:48 -07:00
Lang Hames 65736ac439 [ORC] Remove the OrcExecutionTest class. It is no longer used. 2021-05-13 18:32:36 -07:00
Lang Hames 527bd6dc1c [ORC] Remove unused RTDyldObjectLinkingLayerExecutionTest class from unit test. 2021-05-13 18:32:35 -07:00
Lang Hames c76e3c319e [ORC] Remove some stale unit test utils.
This code was used to test ORCv1, which has been removed. It is not useful for
testing ORCv2.
2021-05-13 18:32:35 -07:00
Matthias Springer 2ca887de6e [mlir] VectorToSCF target rank is a pass option
Make "target rank" a pass option of VectorToSCF.

Depends On D102101

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D102123
2021-05-14 10:30:43 +09:00
Chen Zheng 61484762e9 [Debug-Info] change Tag type to dwarf::Tag for createAndAddDIE; NFC
Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D102207
2021-05-13 21:15:06 -04:00
Peter Collingbourne f79929acea scudo: Fix MTE error reporting for zero-sized allocations.
With zero-sized allocations we don't actually end up storing the
address tag to the memory tag space, so store it in the first byte of
the chunk instead so that we can find it later in getInlineErrorInfo().

Differential Revision: https://reviews.llvm.org/D102442
2021-05-13 18:14:03 -07:00
Peter Collingbourne 9567131d03 scudo: Check for UAF in ring buffer before OOB in more distant blocks.
It's more likely that we have a UAF than an OOB in blocks that are
more than 1 block away from the fault address, so the UAF should
appear first in the error report.

Differential Revision: https://reviews.llvm.org/D102379
2021-05-13 18:14:02 -07:00
Arthur Eubanks ab6a609d96 [test] Fix new-pm-lto-defaults.ll to work on all platforms
https://lab.llvm.org/buildbot/#/builders/119/builds/3775/steps/8/logs/FAIL__LLVM__new-pm-lto-defaults_ll

Followup to D102345.
2021-05-13 18:12:55 -07:00
H.J. Lu 72797dedb7 [sanitizer] Use size_t on g_tls_size to fix build on x32
On x32 size_t == unsigned int, not unsigned long int:

../../../../../src-master/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp: In function ??void __sanitizer::InitTlsSize()??:
../../../../../src-master/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:209:55: error: invalid conversion from ??__sanitizer::uptr*?? {aka ??long unsigned int*??} to ??size_t*?? {aka ??unsigned int*??} [-fpermissive]
  209 |   ((void (*)(size_t *, size_t *))get_tls_static_info)(&g_tls_size, &tls_align);
      |                                                       ^~~~~~~~~~~
      |                                                       |
      |                                                       __sanitizer::uptr* {aka long unsigned int*}

by using size_t on g_tls_size.  This is to fix:

https://bugs.llvm.org/show_bug.cgi?id=50332

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D102446
2021-05-13 18:07:11 -07:00
Chen Zheng 75f3beeedf [Debug-Info] make DIE attributes generation under strict DWARF control
Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D101024
2021-05-13 20:34:07 -04:00