Commit Graph

406735 Commits

Author SHA1 Message Date
Kazu Hirata 630c847b1b [llvm] Use range-based for loops (NFC) 2021-12-07 09:17:03 -08:00
Corentin Jabot 2334314550 Do not check if we are in a discared context in non-immediate contexts
This fixes in a regression introduced by 6eeda06c1.

When deducing the return type of nested function calls, only the
return type of the outermost expression should be ignored.

Instead of assuming all contextes nested in a discared statements
are themselves discarded, only assume that in immediate contexts.

Similarly, only consider contextes immediately in an immediate or
discarded statement as being themselves immediate.
2021-12-07 12:13:35 -05:00
LLVM GN Syncbot 9779972311 [gn build] Port fa99cb64ff 2021-12-07 17:01:16 +00:00
Mircea Trofin fa99cb64ff [mlgo][regalloc] Add score calculation for training
Add the calculation of a score, which will be used during ML training. The
score qualifies the quality of a regalloc policy, and is independent of
what we train (currently, just eviction), or the regalloc algo itself.
We can then use scores to guide training (which happens offline), by
formulating a reward based on score variation - the goal being lowering
scores (currently, that reward is percentage reduction relative to
Greedy's heuristic)

Currently, we compute the score by factoring different instruction
counts (loads, stores, etc) with the machine basic block frequency,
regardless of the instructions' provenance - i.e. they could be due to
the regalloc policy or be introduced previously. This is different from
RAGreedy::reportStats, which accummulates the effects of the allocator
alone. We explored this alternative but found (at least currently) that
the more naive alternative introduced here produces better policies. We
do intend to consolidate the two, however, as we are actively
investigating improvements to our reward function, and will likely want
to re-explore scoring just the effects of the allocator.

In either case, we want to decouple score calculation from allocation
algorighm, as we currently evaluate it after a few more passes after
allocation (also, because score calculation should be reusable
regardless of allocation algorithm).

We intentionally accummulate counts independently because it facilitates
per-block reporting, which we found useful for debugging - for instance,
we can easily report the counts indepdently, and then cross-reference
with perf counter measurements.

Differential Revision: https://reviews.llvm.org/D115195
2021-12-07 09:00:27 -08:00
Aaron Ballman a18632adc8 Add diagnostic groups for attribute extensions
Some users have a need to control attribute extension diagnostics
independent of other extension diagnostics. Consider something like use
of [[nodiscard]] within C++11:
```
[[nodiscard]]
int f();
```
If compiled with -Wc++17-extensions enabled, this will produce warning:
use of the 'nodiscard' attribute is a C++17 extension. This diagnostic
is correct -- using [[nodiscard]] in C++11 mode is a C++17 extension.
And the behavior of __has_cpp_attribute(nodiscard) is also correct --
we support [[nodiscard]] in C++11 mode as a conforming extension. But
this makes use of -Werror or -pedantic-errors` builds more onerous.

This patch adds diagnostic groups for attribute extensions so that
users can selectively disable attribute extension diagnostics. I
believe this is preferable to requiring users to specify additional
flags because it means -Wc++17-extensions continues to be the way we
enable all C++17-related extension diagnostics. It would be quite easy
for someone to use that flag thinking they're protected from some
portability issues without realizing it skipped attribute extensions if
we went the other way.

This addresses PR33518.
2021-12-07 11:49:53 -05:00
spupyrev dc97349505 fixing a broken ext-tsp test
the test requires debug build

example of a failed buildbot:
https://lab.llvm.org/buildbot/#/builders/91/builds/211/steps/8/logs/stdio

Differential Revision: https://reviews.llvm.org/D115255
2021-12-07 08:37:24 -08:00
Florian Hahn e9a2944495
[VPlan] Verify plan entry and exit blocks, set correct exit block.
Both the entry and exit blocks of the top-region of a plan must be
VPBasicBlocks. They also must have no predecessors or successors
respectively.

This invariant was broken when splitting a block for sink-after. To fix
the issue, set the exit block of the region *after* sink-after is done.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D114586
2021-12-07 16:26:31 +00:00
Jay Foad 077a14e00b [AMDGPU] Mark time intrinsics as nomem, hassideeffects
Adding IntrHasSideEffects to @llvm.amdgcn.s.memtime and
@llvm.amdgcn.s.memrealtime means that we can stop pretending they read
and write memory, and similarly for the corresponding pseudo
instructions.

This should stop these intrinsics from being rescheduled past all other
instructions, even ones which don't load or store.

See also https://reviews.llvm.org/D58635.

Differential Revision: https://reviews.llvm.org/D115227
2021-12-07 16:24:06 +00:00
Peter Klausler 398dffd4ff [flang] Fix INQUIRE(FILE=,NAME=)
The file name output was not being copied back to the program
from the runtime.

Differential Revision: https://reviews.llvm.org/D115190
2021-12-07 08:17:08 -08:00
Sanjay Patel 8a69b04478 [InstSimplify] add logic fold for 'or' with 'xor'+'and'
This replaces the 'or' from 4b30076f16 with an 'and'.
We have to guard against propagating undef elements from
vector 'not' values:
https://alive2.llvm.org/ce/z/irMwRc
2021-12-07 11:08:26 -05:00
Sanjay Patel 4b48cdd4dd [InstCombine] add tests for rem with select operand; NFC 2021-12-07 11:08:26 -05:00
Craig Topper 2a9b2444d9 [RISCV] Replace uses of RISCVOpcode<0b0010011> and RISCVOpcode<0b0011011> with existing named objects. NFC
These are already instantiated with names as OPC_OP_IMM and
OPC_OP_IMM_32.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D115172
2021-12-07 08:07:14 -08:00
Lei Zhang 7709b23bef [mlir][scf] NFC: create dedicated files for affine utils
These functions are generic utility functions that operates on
affine ops within SCF regions. Moving them to their own files
for a better code structure, instead of mixing with loop
specialization logic.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D115245
2021-12-07 10:55:32 -05:00
LLVM GN Syncbot 0fc2e6d390 [gn build] Port f573f6866e 2021-12-07 15:31:37 +00:00
spupyrev f573f6866e ext-tsp basic block layout
A new basic block ordering improving existing MachineBlockPlacement.

The algorithm tries to find a layout of nodes (basic blocks) of a given CFG
optimizing jump locality and thus processor I-cache utilization. This is
achieved via increasing the number of fall-through jumps and co-locating
frequently executed nodes together. The name follows the underlying
optimization problem, Extended-TSP, which is a generalization of classical
(maximum) Traveling Salesmen Problem.

The algorithm is a greedy heuristic that works with chains (ordered lists)
of basic blocks. Initially all chains are isolated basic blocks. On every
iteration, we pick a pair of chains whose merging yields the biggest increase
in the ExtTSP value, which models how i-cache "friendly" a specific chain is.
A pair of chains giving the maximum gain is merged into a new chain. The
procedure stops when there is only one chain left, or when merging does not
increase ExtTSP. In the latter case, the remaining chains are sorted by
density in decreasing order.

An important aspect is the way two chains are merged. Unlike earlier
algorithms (e.g., based on the approach of Pettis-Hansen), two
chains, X and Y, are first split into three, X1, X2, and Y. Then we
consider all possible ways of gluing the three chains (e.g., X1YX2, X1X2Y,
X2X1Y, X2YX1, YX1X2, YX2X1) and choose the one producing the largest score.
This improves the quality of the final result (the search space is larger)
while keeping the implementation sufficiently fast.

Differential Revision: https://reviews.llvm.org/D113424
2021-12-07 07:31:10 -08:00
Kirill Bobyrev 976a74d7d2 [clangd] Dex Trigrams: Improve query trigram generation
These are the trigrams for queries right now:

- "va" -> {Trigram("va")}
- "va_" -> {} (empty)

This is suboptimal since the resulting query will discard the query information
and return all symbols, some of which will be later be scored expensively
(fuzzy matching score). This is related to
https://github.com/clangd/clangd/issues/39 but does not fix it. Accidentally,
because of that incorrect behavior, when user types "tok::va" there are no
results (the issue is that `tok::kw___builtin_va_arg` does not have "va" token)
but when "tok::va_" is typed, expected result (`tok::kw___builtin_va_arg`)
shows up by accident. This is because the dex query transformer will only
lookup symbols within the `tok::` namespace. There won't be many, so the
returned results will contain symbol we need; this symbol will be filtered out
by the expensive checks and that will be displayed in the editor.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D113995
2021-12-07 15:46:13 +01:00
gbreynoo 9094a2285b [llvm-symbolizer][docs] Update --output-style=JSON example
The fields output when using --output-style=JSON has changed but the
guide wasn't updated. This change fixes up the example.

Differential Revision: https://reviews.llvm.org/D115164
2021-12-07 14:21:18 +00:00
David Green 1f2e4125fb [ARM] Additional tests for qr instructions with constant operands. NFC 2021-12-07 14:18:32 +00:00
Florian Hahn 22e6094b20
[EarlyCSE] Add test case with inbounds gep where flags can be retained. 2021-12-07 13:46:25 +00:00
Florian Hahn aca7a19039
[EarlyCSE] Auto-generate check lines for flags.ll.
The test already checks the full IR. To make updating easier,
auto-generate the check lines.
2021-12-07 13:46:13 +00:00
Carlos Galvez d40130199f [doc] Fix namespace comment style in Coding Guidelines
The Coding Guidelines specify that the ending brace of a
namespace shall have a comment like:

}  // end namespace clang

However the majority of the code uses a different style:

}  // namespace clang

Indeed:

$ git grep '// end' | wc -l
6724
$ git grep '// namespace' | wc -l
14348

Besides, this is the style enforced automatically by clang-format,
via the FixNamespaceComments option.

Having inconsistencies between the Coding Guidelines and the
code/tooling creates confusion, can lead to bikeshedding during
reviews and overall delays merging code. Therefore, update the
guidelines to reflect current usage. Updating legacy code to the
new standard should be done in a separate patch, if wanted.

Reviewed By: jyknight

Differential Revision: https://reviews.llvm.org/D115115
2021-12-07 13:36:25 +00:00
Pavel Labath d4083a296a [lldb] Fix flakyness in TestQemuLaunch.test_stdio_redirect
The test was flaky because it was trying to read from the (redirected)
stdout file before the data was been flushed to it. This would not be a
problem for a "normal" debug session, but since here the emulator and
the target binary coexist in the same process (and this is true both for
real qemu and our fake implementation), there
is a window of time between the stub returning an exit packet (which is
the event that the test is waiting for) and the process really exiting
(which is when the normal flushing happens).

This patch adds an explicit flush to work around this. Theoretically,
it's possible that real code could run into this issue as well, but such
a use case is not very likely. If we wanted to fix this for real, we
could add some code which waits for the host process to terminate (in
addition to receiving the termination packet), but this is somewhat
complicated by the fact that this code lives in the gdb-remote process
plugin.
2021-12-07 14:19:44 +01:00
Pavel Labath 611fdde4c7 [lldb/qemu] Add emulator-args setting
This setting allows the user to pass additional arguments to the qemu instance.
While we may want to introduce dedicated settings for the most common qemu
arguments (-cpu, for one), having this setting allows us to avoid creating a
setting for every possible argument.

Differential Revision: https://reviews.llvm.org/D115151
2021-12-07 14:19:43 +01:00
Nicolas Vasilache 61ba9f9110 [mlir][Linalg] NFC - Extend the TilingInterface to allow better composition with out-of-tree dialects.
Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D115233
2021-12-07 13:06:27 +00:00
Djordje Todorovic f0f6bba5b2 [MIPS] Add FPU Delay Slot for MIPS1/2/3
MIPS I, II, and III have delay slots for floating point
comparisons and floating point register transfers (mtc1, mfc1).
Currently, these are not taken into account and thus broken code
may be generated on these targets. This patch inserts nops
as necessary, while attempting to leave the current instruction
if it is safe to stay.

The tests in this patch were updated by @sajattack

Patch by @overdrivenpotato (Marko Mijalkovic <marko.mijalkovic97@gmail.com>)

Differential Revision: https://reviews.llvm.org/D115127
2021-12-07 05:02:20 -08:00
Aaron Ballman 7d5315fc4c Fix Sphinx formatting in release notes 2021-12-07 07:56:40 -05:00
Matthias Springer 8a232632c5 [mlir][linalg][bufferize] Add FuncOp bufferization pass
This passes bufferizes FuncOp bodies, but not FuncOp boundaries.

Differential Revision: https://reviews.llvm.org/D114671
2021-12-07 21:44:26 +09:00
Louis Dionne e7f53ec78f [libc++] Bump Dockerfile 2021-12-07 07:30:46 -05:00
Louis Dionne c49a13a45a [libc++] Fix atomic test for _BitInt
In 6c75ab5f66, Clang deprecated _ExtInt in favor of _BitInt, which
made this test fail. This patch disables the test on older compilers
and uses the new _BitInt type instead.

Differential Revision: https://reviews.llvm.org/D115194
2021-12-07 07:30:09 -05:00
Andrew Savonichev 420300c0d8 [MCA] Remove the warning about experimental support for in-order CPU
There are not a lot of bug reports for this feature, so let's mark it
stable.

Differential Revision: https://reviews.llvm.org/D114701
2021-12-07 15:27:51 +03:00
Andrew Savonichev e29ba97d23 [NVPTX] Auto-generate tests for sufrace and texture instructions
The patch adds LIT tests for SULD, SUST, TEX and TLD4 instructions as
a follow up for D112232. There are a number of FIXME marks that
highlight possible bugs or missed instruction variants.

Differential Revision: https://reviews.llvm.org/D114367
2021-12-07 15:27:51 +03:00
Paulo Matos 2fd634a5e3 [WebAssembly] Implement table instruction intrinsics
This change implements intrinsics for table.grow, table.fill,
table.size, and table.copy.

Differential Revision: https://reviews.llvm.org/D113420
2021-12-07 13:25:59 +01:00
Peter Waller ed43aab98d [AArch64][SVE] Fix fptrunc store for fixed len vector
Restrict duplicate FP_EXTEND/FP_TRUNC -> LOAD/STORE DAG combines to only
larger than NEON types, as these are the ones for which there is custom
lowering.

Update tests so that they go through memory to improve validation.

Differential Revision: https://reviews.llvm.org/D115166
2021-12-07 12:22:07 +00:00
Simon Pilgrim 2925f3c9ae [X86] LowerRotate - pull out repeated splitVectorIntBinary call. NFC. 2021-12-07 12:05:33 +00:00
David Spickett 6bfbb89e96 [compiler-rt][libFuzzer] Disable counters test on arm
This test is either very slow or loops forever on 32 bit Arm.

One of a few tests causing timeouts on our buildbots:
https://lab.llvm.org/buildbot/#/builders/190/builds/513
2021-12-07 11:55:11 +00:00
Matthias Springer 4ccbf1d2fb [mlir][linalg][bufferize] Fix forward declaration 2021-12-07 20:13:24 +09:00
Fraser Cormack 40d51de5cb [SelectionDAG] Use UnknownSize for VP memory ops
In the style of D113888, this patch updates the various VP memory
operations (load, store, gather, scatter) to use UnknownSize. This is
for the same reason as for masked loads and stores: the number of
elements accessed is not generally known at compile time.

This is somewhat pessimistic in the sense that we may still find
un-canonicalized intrinsics featuring both an all-true mask and an EVL
equal to the vector size. Arguably those should be canonicalized before
the SelectionDAG, so those have been left for future work.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D115036
2021-12-07 10:51:02 +00:00
Matthias Springer 958ae8b2d4 [mlir][linalg][bufferize] Bufferize Operation* instead of FuncOp
This change mainly changes the API. There is no mentioning of FuncOps in ComprehensiveBufferize anymore.

Also, bufferize methods of the op interface are called for ops without tensor operands/results if they have a region.

Differential Revision: https://reviews.llvm.org/D115212
2021-12-07 19:53:44 +09:00
Florian Hahn 718a1c989a
[X86] Add test where block placement separates call from RV marker.
The test shows how block placement can separate a call from the marker
instruction and the ObjC call after CALL_RVMARKER expansion.
2021-12-07 10:52:54 +00:00
Cullen Rhodes 0395e01583 [IR] Split vscale_range interface
Interface is split from:

  std::pair<unsigned, unsigned> getVScaleRangeArgs()

into separate functions for min/max:

  unsigned getVScaleRangeMin();
  Optional<unsigned> getVScaleRangeMax();

Reviewed By: sdesmalen, paulwalker-arm

Differential Revision: https://reviews.llvm.org/D114075
2021-12-07 10:38:26 +00:00
Fraser Cormack 3460cc2585 [VP] Propagate align parameter attr on VP load/store to ISel
This patch fixes a case where the 'align' parameter attribute on the
pointer operands to llvm.vp.load and llvm.vp.store was being dropped
during the conversion to the SelectionDAG. The default alignment
equal to the ABI type alignment of the vector type was kept. It also
updates the documentation to reflect the fact that the parameter
attribute is now properly supported.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D114422
2021-12-07 10:16:16 +00:00
Jaroslav Sevcik f72ae5cba1 [lldb] Fix windows path guessing for root paths
Fix recognizing "<letter>:\" as a windows path.

Differential Revision: https://reviews.llvm.org/D115104
2021-12-07 11:16:04 +01:00
Ties Stuij 63eb7ff47d [ARM] Implement PAC return address signing mechanism for PACBTI-M
This patch implements PAC return address signing for armv8-m. This patch roughly
accomplishes the following things:

- PAC and AUT instructions are generated.
- They're part of the stack frame setup, so that shrink-wrapping can move them
inwards to cover only part of a function
- The auth code generated by PAC is saved across subroutine calls so that AUT
can find it again to check
- PAC is emitted before stacking registers (so that the SP it signs is the one
on function entry).
- The new pseudo-register ra_auth_code is mentioned in the DWARF frame data
- With CMSE also in use: PAC is emitted before stacking FPCXTNS, and AUT
validates the corresponding value of SP
- Emit correct unwind information when PAC is replaced by PACBTI
- Handle tail calls correctly

Some notes:

We make the assembler accept the `.save {ra_auth_code}` directive that is
emitted by the compiler when it saves a register that contains a
return address authentication code.

For EHABI we need to have the `FrameSetup` flag on the instruction and
handle the `t2PACBTI` opcode (identically to `t2PAC`), so we can emit
`.save {ra_auth_code}`, instead of `.save {r12}`.

For PACBTI-M, the instruction which computes return address PAC should use SP
value before adjustment for the argument registers save are (used for variadic
functions and when a parameter is is split between stack and register), but at
the same it should be after the instruction that saves FPCXT when compiling a
CMSE entry function.

This patch moves the varargs SP adjustment after the FPCXT save (they are never
enabled at the same time), so in a following patch handling of the `PAC`
instruction can be placed between them.

Epilogue emission code adjusted in a similar manner.

PACBTI-M code generation should not emit any instructions for architectures
v6-m, v8-m.base, and for A- and R-class cores. Diagnostic message for such cases
is handled separately by a future ticket.

note on tail calls:

If the called function has four arguments that occupy registers `r0`-`r3`, the
only option for holding the function pointer itself is `r12`, but this register
is used to keep the PAC during function/prologue epilogue and clobbers the
function pointer.

When we do the tail call we need the five registers (`r0`-`r3` and `r12`) to
keep six values - the four function arguments, the function pointer and the PAC,
which is obviously impossible.

One option would be to authenticate the return address before all callee-saved
registers are restored, so we have a scratch register to temporarily keep the
value of `r12`. The issue with this approach is that it violates a fundamental
invariant that PAC is computed using CFA as a modifier. It would also mean using
separate instructions to pop `lr` and the rest of the callee-saved registers,
which would offset the advantages of doing a tail call.

Instead, this patch disables indirect tail calls when the called function take
four or more arguments and the return address sign and authentication is enabled
for the caller function, conservatively assuming the caller function would spill
LR.

This patch is part of a series that adds support for the PACBTI-M extension of
the Armv8.1-M architecture, as detailed here:

https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension

The PACBTI-M specification can be found in the Armv8-M Architecture Reference
Manual:

https://developer.arm.com/documentation/ddi0553/latest

The following people contributed to this patch:

- Momchil Velikov
- Ties Stuij

Reviewed By: danielkiss

Differential Revision: https://reviews.llvm.org/D112429
2021-12-07 10:15:19 +00:00
David Spickett db490ad385 [llvm][X86] Add x86 triple to fentry test
When run on AArch64 hardware llc would default to native
arch.
2021-12-07 10:03:29 +00:00
Jay Foad 47d15170f6 [AMDGPU] Remove redundant mayLoad = 0, mayStore = 0. NFC.
Almost everything in this file is mayLoad = 0, mayStore = 0 by
default anyway.
2021-12-07 09:55:05 +00:00
Cullen Rhodes 698584f89b [IR] Remove unbounded as possible value for vscale_range minimum
The default for min is changed to 1. The behaviour of -mvscale-{min,max}
in Clang is also changed such that 16 is the max vscale when targeting
SVE and no max is specified.

Reviewed By: sdesmalen, paulwalker-arm

Differential Revision: https://reviews.llvm.org/D113294
2021-12-07 09:52:21 +00:00
Gabor Marton 978431e80b [Analyzer] SValBuilder: Simlify a SymExpr to the absolute simplest form
Move the SymExpr simplification fixpoint logic into SValBuilder.

Differential Revision: https://reviews.llvm.org/D114938
2021-12-07 10:02:32 +01:00
Vitaly Buka fc3a260a0f [sanitizer] Don't lock for StackStore::Allocated() 2021-12-07 01:00:01 -08:00
Vitaly Buka 7151c71481 [sanitizer] Fix CompressStackStore VPrint message 2021-12-07 01:00:01 -08:00
LLVM GN Syncbot 7d9f11be81 [gn build] Port ae53d02f55 2021-12-07 08:10:43 +00:00