Commit Graph

393525 Commits

Author SHA1 Message Date
Hongtao Yu cda2394d97 [NFC][CSSPGO] Rename the name of an enum value. 2021-07-13 18:30:16 -07:00
Hongtao Yu 74b99b5c2e [CSSPGO] Do not import pseudo probe desc in thinLTO
Previously we reliedy on pseudo probe descriptors to look up precomputed GUID during probe emission for inlined probes. Since we are moving to always using unique linkage names, GUID for functions can be computed in place from dwarf names. This eliminates the need of importing pseudo probe descs in thinlto, since those descs should be emitted by the original modules.

This significantly reduces thinlto memory footprint in some extreme case where the number of imported modules for a single module is massive.

Test Plan:

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D105248
2021-07-13 18:26:36 -07:00
Hongtao Yu 0712038458 [CSSPGO][llvm-profgen] Allow multiple executable load segments.
The linker or post-link optimizer can create an ELF image with multiple executable segments each of which will be loaded separately at run time. This breaks the assumption of llvm-profgen that currently only supports one base load address. What it ends up with is that the subsequent mmap events will be treated as an overwrite of the first mmap event which will in turn screw up address mapping. While it is non-trivial to support multiple separate load addresses and given that on x64 those segments will always be loaded at consecutive addresses (though via separate mmap
sys calls), I'm adding an error checking logic to bail out if that's violated and keep using a single load address which is the address of the first executable segment.

Also changing the disassembly output from printing section offset to printing the virtual address instead, which matches the behavior of objdump.

Differential Revision: https://reviews.llvm.org/D103178
2021-07-13 18:22:24 -07:00
Vitaly Buka 35ce66330a [NFC][sanitizer] Simplify MapPackedCounterArrayBuffer 2021-07-13 18:16:28 -07:00
Jon Roelofs 87c6bf92a9 [AArch64] rm unused subreg's 2021-07-13 18:06:31 -07:00
Jon Roelofs 6377388c32 [AArch64] Fix AArch64::dsub's size 2021-07-13 18:06:31 -07:00
Arthur Eubanks 5738819679 Revert "[SCEV] Handle zero stride correctly in howManyLessThans"
This reverts commit 4df591b5c9.

Causes crashes, see comments on D105921.
2021-07-13 17:53:48 -07:00
Vitaly Buka ed430023e8 Revert "[NFC][sanitizer] Simplify MapPackedCounterArrayBuffer"
Does not compile.

This reverts commit 8725b382b0.
2021-07-13 17:43:33 -07:00
Jessica Paquette 5bd7cc4f42 [AArch64][GlobalISel] Mark v2s64 -> v2p0 G_INTTOPTR as legal
Allow

```
%x:_<2 x p0> = G_INTTOPTR %y:_<2 x s64>
```

This shows up when building clang for AArch64 with GlobalISel.

Also show that we can select it.

This should match SDAG's behaviour: https://godbolt.org/z/33oqYoaYv

Differential Revision: https://reviews.llvm.org/D105944
2021-07-13 17:28:14 -07:00
Vitaly Buka 8725b382b0 [NFC][sanitizer] Simplify MapPackedCounterArrayBuffer 2021-07-13 17:26:07 -07:00
Vitaly Buka 7140382b17 [NFC][sanitizer] Move MemoryMapper template parameter 2021-07-13 16:52:22 -07:00
Matt Arsenault 3191ac27e3 AMDGPU: Try to fix test failure with EXPENSIVE_CHECKS
The machine verifier is enabled by default for EXPENSIVE_CHECKS, so
the pass runs of it would pollute the output here.
2021-07-13 19:34:14 -04:00
Dmitry Vyukov 832ba20710 sanitizer_common: optimize memory drain
Currently we allocate MemoryMapper per size class.
MemoryMapper mmap's and munmap's internal buffer.
This results in 50 mmap/munmap calls under the global
allocator mutex. Reuse MemoryMapper and the buffer
for all size classes. This radically reduces number of
mmap/munmap calls. Smaller size classes tend to have
more objects allocated, so it's highly likely that
the buffer allocated for the first size class will
be enough for all subsequent size classes.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D105778
2021-07-13 16:28:48 -07:00
Arthur Eubanks 0024ec59a0 [NewPM][SimpleLoopUnswitch] Add option to not trivially unswitch
To help with debugging non-trivial unswitching issues.

Don't care about the legacy pass, nobody is using it.

If a pass's string params are empty (e.g. "simple-loop-unswitch"), don't
default to the empty constructor for the pass params. We should still
let the parser take care of it in case the parser has its own defaults.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D105933
2021-07-13 16:09:42 -07:00
Vitaly Buka 99aebb62fb [NFC][sanitizer] Don't store region_base_ in MemoryMapper
Part of D105778
2021-07-13 15:59:36 -07:00
Matt Arsenault eebe841a47 RegAlloc: Allow targets to split register allocation
AMDGPU normally spills SGPRs to VGPRs. Previously, since all register
classes are handled at the same time, this was problematic. We don't
know ahead of time how many registers will be needed to be reserved to
handle the spilling. If no VGPRs were left for spilling, we would have
to try to spill to memory. If the spilled SGPRs were required for exec
mask manipulation, it is highly problematic because the lanes active
at the point of spill are not necessarily the same as at the restore
point.

Avoid this problem by fully allocating SGPRs in a separate regalloc
run from VGPRs. This way we know the exact number of VGPRs needed, and
can reserve them for a second run.  This fixes the most serious
issues, but it is still possible using inline asm to make all VGPRs
unavailable. Start erroring in the case where we ever would require
memory for an SGPR spill.

This is implemented by giving each regalloc pass a callback which
reports if a register class should be handled or not. A few passes
need some small changes to deal with leftover virtual registers.

In the AMDGPU implementation, a new pass is introduced to take the
place of PrologEpilogInserter for SGPR spills emitted during the first
run.

One disadvantage of this is currently StackSlotColoring is no longer
used for SGPR spills. It would need to be run again, which will
require more work.

Error if the standard -regalloc option is used. Introduce new separate
-sgpr-regalloc and -vgpr-regalloc flags, so the two runs can be
controlled individually. PBQB is not currently supported, so this also
prevents using the unhandled allocator.
2021-07-13 18:49:29 -04:00
Eli Friedman bb8c7a980f [ScalarEvolution] Make isKnownNonZero handle more cases.
Using an unsigned range instead of signed ranges is a bit more precise.

Differential Revision: https://reviews.llvm.org/D105941
2021-07-13 15:36:45 -07:00
Vitaly Buka afa3fedcda [NFC][sanitizer] Exctract DrainHalfMax
Part of D105778
2021-07-13 15:33:22 -07:00
Vitaly Buka 5df9995439 [NFC][sanitizer] Rename some MemoryMapper members
Part of D105778
2021-07-13 15:33:22 -07:00
Geoffrey Martin-Noble 9955c652ea [NFC][MLIR][std] Clean up ArithmeticCastOps
The documentation on these was out of sync with the implementation. Also
the declaration of inputs was repeated when it is already part of the
ArithmeticCastOp definition.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D105934
2021-07-13 15:18:45 -07:00
Victor Huang 18c19414eb [PowerPC] Add PowerPC compare and multiply related builtins and instrinsics for XL compatibility
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch adds the builtins and instrisics for compare
and multiply related operations.

Reviewed By: nemanjai, #powerpc

Differential revision: https://reviews.llvm.org/D102875
2021-07-13 16:55:09 -05:00
MaheshRavishankar f2b5e438aa [mlir][Tensor] Implement `reifyReturnTypeShapesPerResultDim` for `tensor.insert_slice`.
Differential Revision: https://reviews.llvm.org/D105852
2021-07-13 14:53:29 -07:00
Aart Bik 123e8dfcf8 [mlir][sparse] add support for std unary operations
Adds zero-preserving unary operators from std. Also adds xor.
Performs minor refactoring to remove "zero" node, and pushed
the irregular logic for negi (not support in std) into one place.

Reviewed By: gussmith23

Differential Revision: https://reviews.llvm.org/D105928
2021-07-13 14:51:13 -07:00
Adam Paszke 8a2720d81e Add more types to the LLVM dialect C API
This includes:
- void type
- array types
- function types
- literal (unnamed) struct types

Reviewed By: jpienaar, ftynse

Differential Revision: https://reviews.llvm.org/D105908
2021-07-13 14:35:50 -07:00
Derek Schuff d4e2693a67 [WebAssembly] Run varargs codegen test with non-emscripten triple
This is a followup from D105749 to cover both triples in the case
where they differ.
2021-07-13 14:31:19 -07:00
Alexander Yermolovich 24129fbc9a [LLD] Adding support for RELA for CG Profile.
This is a follow up to https://reviews.llvm.org/D104080, and ca3bdb57fa (diff-e64a48fabe31db213a631fdc5f2acb51bdddf3f16a8fb2928784f4c579229585). The implementation of  call graph profile was changed from a black box section to relocation approach. This was done to be compatible with post processing tools like strip/objcopy, and llvm equivalent. When they are invoked on object file before the final linking step with this new approach the symbol indices correctness is preserved.

The GNU binutils tools change the REL section to RELA section, unlike llvm tools. For example when strip -S is run on the ELF object files, as an intermediate step before linking. To preserve compatibility this patch extends implementation in LLD and ELFDumper to support both REL and RELA sections for call graph profile.

Reviewed By: MaskRay, jhenderson

Differential Revision: https://reviews.llvm.org/D105217
2021-07-13 13:56:30 -07:00
Hedin Garca a5a337e55e [libc] Capture floating point encoding and arrange it sequentially in memory
Redefined FPBits.h and LongDoubleBitsX86 so its implementation works for the Windows
and Linux platform while maintaining a packed memory alignment of the precision floating
point numbers. For its size in memory to be the same as the data type of the float point number.
This change was necessary because the previous attribute((packed)) specification in the struct was not working
for Windows like it was for Linux and consequently static_asserts in the FPBits.h file were failing.

Reviewed By: aeubanks, sivachandra

Differential Revision: https://reviews.llvm.org/D105561
2021-07-13 20:43:54 +00:00
Caitlyn Cano a16071e409 [libc] Don't pass -fpie/-ffreestanding on Windows
The current compile options function hardcodes the -fpie and
-ffreestanding flags, which don't exist on Windows. This patch sets the
compilation flags conditionally based on the OS specifics.

Reviewed By: sivachandra, aeubanks

Differential Revision: https://reviews.llvm.org/D105643
2021-07-13 20:39:51 +00:00
Vitaly Buka f990da59c5 [sanitizer] Few more NFC changes from D105778 2021-07-13 13:38:13 -07:00
Philip Reames 4df591b5c9 [SCEV] Handle zero stride correctly in howManyLessThans
This is split from D105216, but the code is hoisted much earlier into the path where we can actually get a zero stride flowing through. Some fairly simple proofs handle the cases which show up in practice. The only test changes are the cases where we really do need a non-zero divider to produce the right result.

Differential Revision: https://reviews.llvm.org/D105921
2021-07-13 13:31:40 -07:00
Martin Storsjö 1c69005c2e [libcxx] [docs] Acknowledge that the library is known to work in some configs outside of what's tested in CI
Differential Revision: https://reviews.llvm.org/D105888
2021-07-13 23:18:55 +03:00
Vitaly Buka 9f1f666b30 [NFC][sanitizer] Move MemoryMapper out of SizeClassAllocator64
Part of D105778
2021-07-13 13:17:36 -07:00
Hedin Garca d12a7f142e [libc] Add on float properties for precision floating point numbers in FloatProperties.h
Defined constant that express the number of bits for exponent in single and double precision. Added bit masks values and other properties for quad precision floating point numbers that specifically targets architectures defined in PlatfromDefs.h. The exponentWidth values were added to be used in LongDoubleBitsX86.h where the implementation to set the exponent component uses this and the bitWidth value. The need occurred because of the 80-bit quad precision implementation.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D105153
2021-07-13 20:15:54 +00:00
Vedant Kumar 5105a77035 [docs/llvm-cov] Document -compilation-dir
Document the `-compilation-dir` option added in D100232.

Differential Revision: https://reviews.llvm.org/D105826
2021-07-13 13:10:02 -07:00
Vitaly Buka d558bfaf8e [NFC][sanitizer] clang-format part of D105778 2021-07-13 13:02:23 -07:00
Vitaly Buka ba8dcaef0d Revert "sanitizer_common: optimize memory drain"
Breaks https://lab.llvm.org/buildbot/#/builders/anitizer-windows

This reverts commit d89d3dfae1.
2021-07-13 12:58:57 -07:00
Arthur O'Dwyer 7efe388785 [libc++] [test] Add a missing `()` in TestEachIntegralType. 2021-07-13 15:57:43 -04:00
Hafiz Abid Qadeer fb9c5c3dce [lld][AMDGPU] Handle R_AMDGPU_REL16 relocation.
This patch is a followup patch to https://reviews.llvm.org/D105760 which adds this relocation. This handles the relocation in lld.

The s_branch family of instruction does the following:
PC = PC + signext(simm * 4) + 4

so we we do the opposite on the target address before writing it in the instruction stream.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D105761
2021-07-13 20:41:11 +01:00
thomasraoux 6296e10972 [mlir][Vector] Remove Vector TupleOp as it is unused
TupleOp is not used anymore after recent refactoring.

Differential Revision: https://reviews.llvm.org/D105924
2021-07-13 12:39:12 -07:00
Eli Friedman b28c465e49 [NFC] Use CHECK-LABEL in trip-count-unknown-stride.ll 2021-07-13 12:21:13 -07:00
Eli Friedman 5d1ba53404 [LoopReroll] Add an extra defensive check to avoid SCEV assertion.
Make sure getMinusSCEV() didn't return a pointer.  The following check
would never succeed if it was a pointer, anyway, but calling
getMulExpr() on a pointer SCEV now asserts.
2021-07-13 12:17:09 -07:00
Nico Weber 3ea8860afb [gn build] (manually) port 303ddb60a2 2021-07-13 15:15:38 -04:00
Philip Reames 5ca9cf0e6b [tests] Precommit a test case from D105216 2021-07-13 12:02:44 -07:00
Artem Belevich 25629bb45f Fix cuda-bad-arch.cu test.
Tests for correctness of HIP architecture need `- xhip`
2021-07-13 11:57:25 -07:00
Philip Reames 087310c71e [SCEV] Strengthen inference of RHS > Start in howManyLessThans
Split off from D105216 to simplify review.  Rewritten with a lambda to be easier to follow.  Comments clarified.

Sorry for no test case, this is tricky to exercise with the current structure of the code.  It's about to be hit more frequently in a follow up patch, and the change itself is simple.
2021-07-13 11:54:07 -07:00
Jon Roelofs 3e5cff19fd [Tests] Fix test broken by: 43c7ca8e49 [AArch64][GlobalISel] Legalize store <2 x i16> 2021-07-13 11:36:36 -07:00
Krishna Kariya e56b2e5706 [InstCombine] Precommit tests for D105088 (NFC)
Add tests for D105088, as well as an option to disable the
(generally) unsound inttoptr of ptrtoint optimization.

Differential Revision: https://reviews.llvm.org/D105771
2021-07-13 20:35:04 +02:00
Thomas Lively 308d381283 [WebAssembly] Generate checks for simd-load-store-alignment.ll
This will make it easier to update these tests as we add support for generating
more SIMD loads and stores with custom alignments.

Differential Revision: https://reviews.llvm.org/D105862
2021-07-13 11:25:33 -07:00
Victor Huang 781929b423 [PowerPC][NFC] Power ISA features for Semachecking
[NFC] This patch adds features for pwr7, pwr8, and pwr9 that can be
used for semachecking builtin functions that are only valid for certain
versions of ppc.

Reviewed By: nemanjai, #powerpc
Authored By: Quinn Pham <Quinn.Pham@ibm.com>

Differential revision: https://reviews.llvm.org/D105501
2021-07-13 13:13:34 -05:00
Victor Huang e4585d3f4e Revert "[PowerPC][NFC] Power ISA features for Semachecking"
This reverts commit 10e0cdfc65.
2021-07-13 13:13:34 -05:00