Commit Graph

397541 Commits

Author SHA1 Message Date
Jez Ng 2179930868 [lld-macho] Fix unwind info personality size
This was missed by {D107035}. This fix addresses the following warning:

  loop variable 'personality' has type 'const uint32_t &' (aka 'const unsigned int &') but is initialized with type 'const unsigned long long' resulting in a copy [-Wrange-loop-analysis]

In addition to fixing the size, I also removed the const reference,
since there's no performance benefit to avoiding copies of integer-sized
values.
2021-08-26 18:52:06 -04:00
Butygin 1e35a7690d [mlir][spirv] Initial support for 64 bit index type and builtins
Differential Revision: https://reviews.llvm.org/D108516
2021-08-27 01:38:53 +03:00
Benson Chu 7bd92f5911 [AST] Pick last tentative definition as the acting definition
Clang currently picks the second tentative definition when
VarDecl::getActingDefinition is called.

This can lead to attributes being dropped if they are attached to
tentative definitions that appear after the second one. This is
because VarDecl::getActingDefinition loops through VarDecl::redecls
assuming that the last tentative definition is the last element in the
iterator. However, it is the second element that would be the last
tentative definition.

This changeset modifies getActingDefinition to iterate through the
declaration chain in reverse, so that it can immediately return when
it encounters a tentative definition.

Originally the unit test for this changeset did not have a -triple
flag for the clang invocation, leading to this test being broken on
MacOS, since Mach-O does not support the section attribute.

Differential Revision: https://reviews.llvm.org/D99732
2021-08-26 16:49:54 -05:00
Arthur Eubanks 6eed1fb349 [clang][NewPM] Mention that legacy PM flags are deprecated
Differential Revision: https://reviews.llvm.org/D108789
2021-08-26 14:42:55 -07:00
Yonghong Song 82d9cb34a2 [DebugInfo] convert btf_tag attrs to DI annotations for func parameters
Generate btf_tag annotations for DILocalVariable. The annotations
are represented as an DINodeArray in DebugInfo.

Differential Revision: https://reviews.llvm.org/D106620
2021-08-26 14:27:58 -07:00
Fangrui Song a42bd1b560 [CMake] Change -DENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=off to -DLLVM_ENABLE_NEW_PASS_MANAGER=off
LLVM_ENABLE_NEW_PASS_MANAGER is set to ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER, so
-DLLVM_ENABLE_NEW_PASS_MANAGER=off has no effect.

Change the cache variable to LLVM_ENABLE_NEW_PASS_MANAGER instead.
A user opting out the new PM needs to switch from
-DENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=off to
-DLLVM_ENABLE_NEW_PASS_MANAGER=off.

Also give a warning that -DLLVM_ENABLE_NEW_PASS_MANAGER=off is deprecated.

Reviewed By: aeubanks, phosek

Differential Revision: https://reviews.llvm.org/D108775
2021-08-26 14:25:31 -07:00
Yonghong Song 1bebc31c61 [DebugInfo] generate btf_tag annotations for func parameters
Generate btf_tag annotations for function parameters.
A field "annotations" is introduced to DILocalVariable, and
annotations are represented as an DINodeArray, similar to
DIComposite elements. The following example illustrates how
annotations are encoded in IR:
    distinct !DILocalVariable(name: "info",, arg: 1, ..., annotations: !10)
    !10 = !{!11, !12}
    !11 = !{!"btf_tag", !"a"}
    !12 = !{!"btf_tag", !"b"}

Differential Revision: https://reviews.llvm.org/D106620
2021-08-26 14:18:30 -07:00
Artem Dergachev 7309359928 [analyzer] Fix scan-build report deduplication.
The previous behavior was to deduplicate reports based on md5 of the
html file. This algorithm might have worked originally but right now
HTML reports contain information rich enough to make them virtually
always distinct which breaks deduplication entirely.

The new strategy is to (finally) take advantage of IssueHash - the
stable report identifier provided by clang that is the same if and only if
the reports are duplicates of each other.

Additionally, scan-build no longer performs deduplication on its own.
Instead, the report file name is now based on the issue hash,
and clang instances will silently refuse to produce a new html file
when a duplicate already exists. This eliminates the problem entirely.

The '-analyzer-config stable-report-filename' option is deprecated
because report filenames are no longer unstable. A new option is
introduced, '-analyzer-config verbose-report-filename', to produce
verbose file names that look similar to the old "stable" file names.
The old option acts as an alias to the new option.

Differential Revision: https://reviews.llvm.org/D105167
2021-08-26 13:34:29 -07:00
Kirill Stoimenov a3f4139626 [asan] Implemented flag to emit intrinsics to optimize ASan callbacks.
Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D108377
2021-08-26 20:33:57 +00:00
Kirill Stoimenov 2e83a0efb9 [asan] Fixed a runtime crash.
Looks like the NoRegister has some effect on the final code that is generated. My guess is that some optimization kicks in at the end?

When I use -S to dump the assembly I get the correct version with 'shrq    $3, %r8':
        movq    %r9, %r8
        shrq    $3, %r8
        movsbl  2147450880(%r8), %r8d

But, when I disassemble the final binary I get RAX in stead of R8:
        mov    %r9,%r8
        shr    $0x3,%rax
        movsbl 0x7fff8000(%r8),%r8d

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D108745
2021-08-26 20:30:25 +00:00
Rob Suderman 90478251c7 [mlir][tosa] Tosa reverse to linalg supporting dynamic shapes
Needed to switch to extract to support tosa.reverse using dynamic shapes.

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D108744
2021-08-26 13:23:59 -07:00
Alexey Bataev 84cbd71c95 [SLP]Improve graph reordering.
Reworked reordering algorithm. Originally, the compiler just tried to
detect the most common order in the reordarable nodes (loads, stores,
extractelements,extractvalues) and then fully rebuilding the graph in
the best order. This was not effecient, since it required an extra
memory and time for building/rebuilding tree, double the use of the
scheduling budget, which could lead to missing vectorization due to
exausted scheduling resources.

Patch provide 2-way approach for graph reodering problem. At first, all
reordering is done in-place, it doe not required tree
deleting/rebuilding, it just rotates the scalars/orders/reuses masks in
the graph node.

The first step (top-to bottom) rotates the whole graph, similarly to the previous
implementation. Compiler counts the number of the most used orders of
the graph nodes with the same vectorization factor and then rotates the
subgraph with the given vectorization factor to the most used order, if
it is not empty. Then repeats the same procedure for the subgraphs with
the smaller vectorization factor. We can do this because we still need
to reshuffle smaller subgraph when buildiong operands for the graph
nodes with lasrger vectorization factor, we can rotate just subgraph,
not the whole graph.

The second step (bottom-to-top) scans through the leaves and tries to
detect the users of the leaves which can be reordered. If the leaves can
be reorder in the best fashion, they are reordered and their user too.
It allows to remove double shuffles to the same ordering of the operands in
many cases and just reorder the user operations instead. Plus, it moves
the final shuffles closer to the top of the graph and in many cases
allows to remove extra shuffle because the same procedure is repeated
again and we can again merge some reordering masks and reorder user nodes
instead of the operands.

Also, patch improves cost model for gathering of loads, which improves
x264 benchmark in some cases.

Gives about +2% on AVX512 + LTO (more expected for AVX/AVX2) for {625,525}x264,
+3% for 508.namd, improves most of other benchmarks.
The compile and link time are almost the same, though in some cases it
should be better (we're not doing an extra instruction scheduling
anymore) + we may vectorize more code for the large basic blocks again
because of saving scheduling budget.

Differential Revision: https://reviews.llvm.org/D105020
2021-08-26 12:31:18 -07:00
Nikita Popov 8441a8eea8 [MergeICmps] Add test for call before first load (NFC)
If a clobbering call happens before all loads, that shouldn't
block the transform.
2021-08-26 21:24:22 +02:00
Arthur Eubanks 14d45e41bf [test] Update precommit tests for D108734 2021-08-26 12:05:56 -07:00
Vitaly Buka 96fa1eaae4 [sanitizer] Add basic qsort test 2021-08-26 12:03:26 -07:00
Jon Chesterfield 3d85342982 [libomptarget][amdgpu][nfc] Rename variables, delete dead code 2021-08-26 19:58:38 +01:00
Andrea Di Biagio 44a13f33be Revert "[MCA][NFC] Remove redundant calls to std::move."
This reverts commit 9cc0023fb8.
due to buildbot failures.
2021-08-26 19:53:17 +01:00
Siva Chandra Reddy 004c7b1da6 [libc][NFC] Move the mutex implementation into a utility class.
This allows others parts of the libc to use the mutex types without
actually pulling in public function implementations.

Along the way, few cleanups have been done, like using a uniform type to
refer the linux futex word.

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D108749
2021-08-26 18:49:20 +00:00
Andrea Di Biagio 9cc0023fb8 [MCA][NFC] Remove redundant calls to std::move.
This fixes some redundant move in return statement [-Wredundant-move] gcc 9.3.0
warnings.

This also fixes a minor coverity issue reported agaist class MCAOperand about
the lack of proper initialization for field Index.

No functional change intended.
2021-08-26 19:47:59 +01:00
Jessica Paquette 2363a20001 [AArch64][GlobalISel] Optimize G_BUILD_VECTOR of undef + 1 elt -> SUBREG_TO_REG
This pattern

```
%elt = ... something ...
%undef = G_IMPLICIT_DEF
%vec = G_BUILD_VECTOR %elt, %undef, %undef, ... %undef
```

Can be selected to a SUBREG_TO_REG, assuming `%elt` and `%vec` have the same
register bank. We don't care about any of the bits in `%vec` aside from those
in `%elt`, which just happens to be the 0th element.

This is preferable to emitting `mov` instructions for every index.

This gives minor code size improvements on the test suite at -Os.

Differential Revision: https://reviews.llvm.org/D108773
2021-08-26 11:45:11 -07:00
RamNalamothu 9b9e7f6f4e [docs, AMDGPU] Fix typo in dwarf register number mapping
Reviewed By: xgupta

Differential Revision: https://reviews.llvm.org/D108557
2021-08-26 23:55:29 +05:30
Yaron Keren 1958575859 [docs] Update Getting Started with Visual Studio guide
Update this document for 2021.

Reviewed By: aaron.ballman, kuhnel, amccarth

Differential Revision: https://reviews.llvm.org/D108513
2021-08-26 21:21:36 +03:00
Rob Suderman 0600bb4d18 [mlir][tosa] Elementwise operation dynamic shape support
Added dynamic shape support for elementwise operations. This assumes equal
sizes (broadcasting 1-length dynamic is problematic).

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D108730
2021-08-26 11:18:58 -07:00
Louis Dionne 19e806e88d [libc++][NFC] Sort headers alphabetically 2021-08-26 14:18:09 -04:00
Shafik Yaghmour 2a4a498a88 [LLDB] Add type to the output for FieldDecl when logging in ClangASTSource::layoutRecordType
I was debugging a problem and noticed that it would have been helpful to have
the type of each FieldDecl when looking at the output from
ClangASTSource::layoutRecordType.

Differential Revision: https://reviews.llvm.org/D108257
2021-08-26 11:17:47 -07:00
Andrea Di Biagio 1eb75362c9 [MCA][RegisterFile] Consistently update the PRF in the presence of multiple writes to the same register.
My last change to the RegisterFile (PR51495) has introduced a bug in the logic
that allocates physical registers in the PRF.

In some cases, this bug could have triggered a nasty unsigned wrap in the number
of allocated registers, thus resulting in mca being stuck forever in a loop of
PRF availability checks.
2021-08-26 19:16:20 +01:00
LLVM GN Syncbot 9ade9d9ac1 [gn build] Port ee44dd8062 2021-08-26 18:08:07 +00:00
Louis Dionne ee44dd8062 [libc++] Implement the underlying mechanism for range adaptors
This patch implements the underlying mechanism for range adaptors. It
does so based on http://wg21.link/p2387, even though that paper hasn't
been adopted yet. In the future, if p2387 is adopted, it would suffice
to rename `__bind_back` to `std::bind_back` and `__range_adaptor_closure`
to `std::range_adaptor_closure` to implement that paper by the spec.

Differential Revision: https://reviews.llvm.org/D107098
2021-08-26 14:07:21 -04:00
Michael Jones 035325275c [libc] add inttypes header
Add inttypes.h to llvm libc. As its first functions strtoimax and
strtoumax are included.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D108736
2021-08-26 18:04:21 +00:00
Alexey Bataev dc94761f3b [SLP][NFC]Add a test for correct shuffles order after reordering. 2021-08-26 10:37:09 -07:00
Yonghong Song d2d7a90ced [DebugInfo] convert btf_tag attrs to DI annotations for DIGlobalVariable
Generate btf_tag annotations for DIGlobalVariable. The annotations
are represented as an DINodeArray in DebugInfo.

Differential Revision: https://reviews.llvm.org/D106619
2021-08-26 10:36:33 -07:00
Walter Erquinigo c62ef0255d [NFC] Removing deprecated intel-features test folder
This folder has no valid tests anymore
2021-08-26 10:36:00 -07:00
Sanjay Patel 038704c43b [GlobalOpt] add tests for constant expressions that can trap; NFC
https://llvm.org/PR47578
2021-08-26 13:34:31 -04:00
Walter Erquinigo 600a2a7ec0 [NFC] Remove deprecated Intel PT test 2021-08-26 10:34:04 -07:00
Jon Chesterfield 68ab93f4d7 [libomptarget][amdgpu][nfc] Rename source files 2021-08-26 18:29:44 +01:00
Louis Dionne a4357bc496 [libc++] Fix incorrect bypassing of <wctype.h>
Differential Revision: https://reviews.llvm.org/D108709
2021-08-26 13:26:31 -04:00
Vitaly Buka 39100c82d3 [NFC][sanitizer] Swap qsort_r and qsort code
To simplify future review.
2021-08-26 10:24:59 -07:00
Louis Dionne f640c31e4b [libc++] XFAIL align.pass.cpp for PowerPC LE
This patch XFAILs the `align.pass.cpp` for PowerPC (LE).

It appears that this test will fail on Power for the `LLIArr2` and `Padding` structs within the test,
as the `assert` for `alignof(AtomicImpl) >= sizeof(AtomicImpl)` will be false. In this case, these structs
presumably should not be lock-free, so we currently XFAIL this for now.

The failure was discovered after D97913 was committed. It looks like `alignof(AtomicImpl) < sizeof(AtomicImpl)`,
even prior to this commit, but this test began running on Power after D97913, whereas we were
not running `align.pass.cpp` before.

This patch addresses https://bugs.llvm.org/show_bug.cgi?id=51548 by temporarily XFAILing the test
in order to investigate it further.

Differential Revision: https://reviews.llvm.org/D108668
2021-08-26 13:21:40 -04:00
Craig Topper 1b9417454e [RISCV] Insert a sext_inreg when type legalizing i32 shl by constant on RV64.
Similar to what we do for add/sub/mul.

This can help remove some sext.w. There are some regressions on
some bswap tests, but I have an idea how to fix that for a follow up.

A new PACKW pattern is added to handle the new sext_inreg placement.

Differential Revision: https://reviews.llvm.org/D108663
2021-08-26 10:20:19 -07:00
Fangrui Song abb956370e [CMake] Enable LLVM_ENABLE_PER_TARGET_RUNTIME_DIR by default on Linux
This makes the default build closer to a -DLLVM_ENABLE_RUNTIMES=all build.
The layout is arguably superior because different libraries of target triples
are in different directories, similar to GCC/Debian multiarch.

When LLVM_DEFAULT_TARGET_TRIPLE is x86_64-unknown-linux-gnu,
`lib/clang/14.0.0/lib/libclang_rt.asan-x86_64.a`
becomes
`lib/clang/14.0.0/lib/x86_64-unknown-linux-gnu/libclang_rt.asan.a`.

Clang has been detecting both paths since 2018 (D50547).

---

Note: Darwin needs to be disabled. The hierarchy needs to be sorted out.
The current -DLLVM_DEFAULT_TARGET_TRIPLE=off state is like:
```
lib/clang/14.0.0/lib/darwin/libclang_rt.profile_ios.a
lib/clang/14.0.0/lib/darwin/libclang_rt.profile_iossim.a
lib/clang/14.0.0/lib/darwin/libclang_rt.profile_osx.a
```

Windows needs to be disabled: https://reviews.llvm.org/D107799?id=368557#2963311

Differential Revision: https://reviews.llvm.org/D107799
2021-08-26 10:13:16 -07:00
Yonghong Song 30c288489a [DebugInfo] generate btf_tag annotations for DIGlobalVariable
Generate btf_tag annotations for DIGlobalVariable.
A field "annotations" is introduced to DIGlobalVariable, and
annotations are represented as an DINodeArray, similar to
DIComposite elements. The following example illustrates how
annotations are encoded in IR:
    distinct !DIGlobalVariable(..., annotations: !10)
    !10 = !{!11, !12}
    !11 = !{!"btf_tag", !"a"}
    !12 = !{!"btf_tag", !"b"}

Differential Revision: https://reviews.llvm.org/D106619
2021-08-26 10:03:44 -07:00
RamNalamothu be19aee4b2 [DWARFLinker] Prefix debug section names with '.' in the comments. NFC.
In DWARFLinker.h, some comments prefix the debug section names
with '.' while others do not.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D108519
2021-08-26 22:29:21 +05:30
Aaron Ballman a233f0350d Typo fix; NFC 2021-08-26 12:53:52 -04:00
Aaron Ballman 0cf4f81082 Adding an assertion back.
This assert was removed in 98339f14a0,
but during post-commit review, it was pointed out that the assert was
valid.
2021-08-26 12:51:14 -04:00
Andrew Litteken 9d2c859ebb [CodeExtractor] Making the arguments outlined easier to access from the outside
The Code Extractor does not provide an easy mechanism for determining the
inputs and outputs after extraction has occurred, this patch gives the
ability to pass in empty SetVectors to be filled with the inputs and
outputs if they need to be analyzed.

Added Tests:
- InputOutputMonitoring in unittests/Transforms/Utils/CodeExtractorTests.cpp

Reviewers: paquette

Differential Revision: https://reviews.llvm.org/D106991
2021-08-26 09:47:53 -07:00
Luís Marques 34e055d33e [Clang][RISCV] Implement getConstraintRegister for RISC-V
The getConstraintRegister method is used by semantic checking of inline
assembly statements in order to diagnose conflicts between clobber list
and input/output lists. By overriding getConstraintRegister we get those
diagnostics and we match RISC-V GCC's behavior. The implementation is
trivial due to the lack of single-register RISC-V-specific constraints.

Differential Revision: https://reviews.llvm.org/D108624
2021-08-26 17:43:43 +01:00
Stanislav Mekhanoshin 827dd17e26 [AMDGPU] Invert partial vgpr to agpr spill lane order
On targets requiring VGPR alignment we may end up spilling an
unaligned register if we were partially spilled odd number of
leading lanes. The reminder will start with an odd register.

This problem is solved by inverting the order of lanes to
be spillied so that we start from the end.

Differential Revision: https://reviews.llvm.org/D108732
2021-08-26 09:39:03 -07:00
Craig Topper 8bb24289f3 [SelectionDAG] Optimize bitreverse expansion to minimize the number of mask constants.
We can halve the number of mask constants by masking before shl
and after srl.

This can reduce the number of mov immediate or constant
materializations. Or reduce the number of constant pool loads
for X86 vectors.

I think we might be able to do something similar for bswap. I'll
look at it next.

Differential Revision: https://reviews.llvm.org/D108738
2021-08-26 09:33:24 -07:00
LLVM GN Syncbot 70f3ccb6a2 [gn build] Port 1076082a0d 2021-08-26 16:28:53 +00:00
Jon Chesterfield a5f4074d85 [libomptarget][amdgpu] Macro for accessing GPU variables from plugin
Lets the amdgpu plugin write to omptarget_device_environment
to enable debugging. Intend to use in the near future to record the
wavesize that a given deviceRTL was compiled with for running on hardware
that supports 32 or 64.

Patch sets all the attributes that are useful. Notably .data means the variable
is set by writing to host memory before copying to the GPU instead of launching
a kernel to update the image. Can simplify the plugin slightly to drop the
code for patching after load if this is used consistently.

NFC on nvptx, cuda plugin seems to work fine without any annotations.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D108698
2021-08-26 17:28:18 +01:00