Commit Graph

397501 Commits

Author SHA1 Message Date
Yonghong Song 30c288489a [DebugInfo] generate btf_tag annotations for DIGlobalVariable
Generate btf_tag annotations for DIGlobalVariable.
A field "annotations" is introduced to DIGlobalVariable, and
annotations are represented as an DINodeArray, similar to
DIComposite elements. The following example illustrates how
annotations are encoded in IR:
    distinct !DIGlobalVariable(..., annotations: )
     = !{!11, !12}
     = !{!"btf_tag", !"a"}
     = !{!"btf_tag", !"b"}

Differential Revision: https://reviews.llvm.org/D106619
2021-08-26 10:03:44 -07:00
RamNalamothu be19aee4b2 [DWARFLinker] Prefix debug section names with '.' in the comments. NFC.
In DWARFLinker.h, some comments prefix the debug section names
with '.' while others do not.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D108519
2021-08-26 22:29:21 +05:30
Aaron Ballman a233f0350d Typo fix; NFC 2021-08-26 12:53:52 -04:00
Aaron Ballman 0cf4f81082 Adding an assertion back.
This assert was removed in 98339f14a0,
but during post-commit review, it was pointed out that the assert was
valid.
2021-08-26 12:51:14 -04:00
Andrew Litteken 9d2c859ebb [CodeExtractor] Making the arguments outlined easier to access from the outside
The Code Extractor does not provide an easy mechanism for determining the
inputs and outputs after extraction has occurred, this patch gives the
ability to pass in empty SetVectors to be filled with the inputs and
outputs if they need to be analyzed.

Added Tests:
- InputOutputMonitoring in unittests/Transforms/Utils/CodeExtractorTests.cpp

Reviewers: paquette

Differential Revision: https://reviews.llvm.org/D106991
2021-08-26 09:47:53 -07:00
Luís Marques 34e055d33e [Clang][RISCV] Implement getConstraintRegister for RISC-V
The getConstraintRegister method is used by semantic checking of inline
assembly statements in order to diagnose conflicts between clobber list
and input/output lists. By overriding getConstraintRegister we get those
diagnostics and we match RISC-V GCC's behavior. The implementation is
trivial due to the lack of single-register RISC-V-specific constraints.

Differential Revision: https://reviews.llvm.org/D108624
2021-08-26 17:43:43 +01:00
Stanislav Mekhanoshin 827dd17e26 [AMDGPU] Invert partial vgpr to agpr spill lane order
On targets requiring VGPR alignment we may end up spilling an
unaligned register if we were partially spilled odd number of
leading lanes. The reminder will start with an odd register.

This problem is solved by inverting the order of lanes to
be spillied so that we start from the end.

Differential Revision: https://reviews.llvm.org/D108732
2021-08-26 09:39:03 -07:00
Craig Topper 8bb24289f3 [SelectionDAG] Optimize bitreverse expansion to minimize the number of mask constants.
We can halve the number of mask constants by masking before shl
and after srl.

This can reduce the number of mov immediate or constant
materializations. Or reduce the number of constant pool loads
for X86 vectors.

I think we might be able to do something similar for bswap. I'll
look at it next.

Differential Revision: https://reviews.llvm.org/D108738
2021-08-26 09:33:24 -07:00
LLVM GN Syncbot 70f3ccb6a2 [gn build] Port 1076082a0d 2021-08-26 16:28:53 +00:00
Jon Chesterfield a5f4074d85 [libomptarget][amdgpu] Macro for accessing GPU variables from plugin
Lets the amdgpu plugin write to omptarget_device_environment
to enable debugging. Intend to use in the near future to record the
wavesize that a given deviceRTL was compiled with for running on hardware
that supports 32 or 64.

Patch sets all the attributes that are useful. Notably .data means the variable
is set by writing to host memory before copying to the GPU instead of launching
a kernel to update the image. Can simplify the plugin slightly to drop the
code for patching after load if this is used consistently.

NFC on nvptx, cuda plugin seems to work fine without any annotations.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D108698
2021-08-26 17:28:18 +01:00
Alexandre Rames 1076082a0d [Support]: Introduce the `HashBuilder` interface.
The `HashBuilder` interface allows conveniently building hashes of various data
types, without relying on the underlying hasher type to know about hashed data
types.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D106910
2021-08-26 09:20:50 -07:00
Alexey Bataev b00f73d8bf Revert "[SLP]Improve graph reordering."
This reverts commit a28234e37a to
investigate a compiler crash caused by the commit.
2021-08-26 09:19:40 -07:00
Balazs Benics af79f1bff9 [analyzer] Extend the documentation of MallocOverflow
Previously by following the documentation it was not immediately clear
what the capabilities of this checker are.

In this patch, I add some clarification on when does the checker issue a
report and what it's limitations are.
I'm also advertising suppressing such reports by adding an assertion, as
demonstrated by the test3().
I'm highlighting that this checker might produce an extensive amount of
findings, but it might be still useful for code audits.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D107756
2021-08-26 18:15:10 +02:00
Kazu Hirata cce49dcb85 [IR] Remove addPseudoProbeAttribute (NFC)
The last use was removed on Jun 17, 2021 in commit
bd52495518.
2021-08-26 09:02:26 -07:00
Simon Wallis c4dc81eeab [AArch64] provide strictfp attributes in test file
A post-commit review comment on  https://reviews.llvm.org/D107452 pointed out that
https://llvm.org/docs/LangRef.html
says:
"In a function that uses the constrained intrinsics the strictfp attribute is required on all function calls."

Although there are several files across several test directories which don't follow this guidance, it is straightforward to provide this attribute.

Reviewed By: kpn

Differential Revision: https://reviews.llvm.org/D107567
2021-08-26 16:56:43 +01:00
Yonghong Song 2de051ba12 [DebugInfo] convert btf_tag attrs to DI annotations for DISubprograms
Generate btf_tag annotations for DISubprograms. The annotations
are represented as an DINodeArray in DebugInfo.

Differential Revision: https://reviews.llvm.org/D106618
2021-08-26 08:54:11 -07:00
Roman Lebedev a8125bf4a8
[X86][Codegen] PR51615: don't replace wide volatile load with narrow broadcast-from-memory
Even though https://bugs.llvm.org/show_bug.cgi?id=51615
appears to be introduced by D105390, the fix lies here.

We can not replace a wide volatile load with a broadcast-from-memory,
because that would narrow the load, which isn't legal for volatiles.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D108757
2021-08-26 18:46:49 +03:00
Florian Hahn 0bcfd4cbac
[ConstraintElimination] Rewrite tests to reduce verification complexity.
This patch reduces the bitwidth of types certain tests operate and gets
rid of a number of @use(i1) calls and xor's the conditions together
instead, which eliminates all timeouts when verifying the tests.
See https://github.com/AliveToolkit/alive2/issues/744 for more details.
2021-08-26 16:41:40 +01:00
Anna Thomas 55bdb14026 [LoopPredication] Preserve MemorySSA
Since LICM has now unconditionally moved to MemorySSA based form, all
passes that run in same LPM as LICM need to preserve MemorySSA (i.e. our
downstream pipeline).

Added loop-mssa to all tests and perform -verify-memoryssa within
LoopPredication itself.

Differential Revision: https://reviews.llvm.org/D108724
2021-08-26 11:36:25 -04:00
Arthur O'Dwyer 15acca5ccd [libc++] Revert a use of `static_cast` for `_VSTD::forward`. NFCI.
As requested in D107584.

Differential Revision: https://reviews.llvm.org/D108743
2021-08-26 11:35:07 -04:00
Yonghong Song d383df32c0 [DebugInfo] generate btf_tag annotations for DISubprogram types
Generate btf_tag annotations for DISubprogram types.
A field "annotations" is introduced to DISubprogram, and
annotations are represented as an DINodeArray, similar to
DIComposite elements. The following example illustrates how
annotations are encoded in IR:
    distinct !DISubprogram(..., annotations: )
     = !{!11, !12}
     = !{!"btf_tag", !"a"}
     = !{!"btf_tag", !"b"}

Differential Revision: https://reviews.llvm.org/D106618
2021-08-26 08:24:19 -07:00
Andrew Wei c9066c5d37 [CGP] Fix the crash for combining address mode when having cyclic dependency
In the combination of addressing modes, when replacing the matched phi nodes,
sometimes the phi node to be replaced has been modified. For example,
there’s matcher set [A, B] and [C, A], which will have cyclic dependency:
A is replaced by B and C will be replaced by A. Because we tried to match new phi node
to another new phi node, we should ignore new phi nodes when mapping new phi node to old one.

Reviewed By: skatkov

Differential Revision: https://reviews.llvm.org/D108635
2021-08-26 22:52:42 +08:00
Jacob Bramley 05f3219b38 [AArch64] Lower fpto*i.sat intrinsics for NEON.
Following on from D102353, extend the fpto*i.sat intrinsics to use NEON
fcvt* instructions.

Differential Revision: https://reviews.llvm.org/D108460
2021-08-26 15:37:00 +01:00
Joe Loser 231cf0e881 [libc++][NFC] Fix typo in test/support/test_range.h
Fix typo in `#error` filepath.

Differential Revision: https://reviews.llvm.org/D108764
2021-08-26 10:34:50 -04:00
Kent Ross 3fe7dde5f1 [libc++][doc] Cleanup, normalize, and update projects status docs
Mark the now-done [cmp.result] in spaceship projects as complete;
normalize some status markers for papers and projects; fix alignment
and line breaks in spaceship projects, add links to standard

Differential Revision: https://reviews.llvm.org/D108502
2021-08-26 10:33:52 -04:00
Alexey Bataev a28234e37a [SLP]Improve graph reordering.
Reworked reordering algorithm. Originally, the compiler just tried to
detect the most common order in the reordarable nodes (loads, stores,
extractelements,extractvalues) and then fully rebuilding the graph in
the best order. This was not effecient, since it required an extra
memory and time for building/rebuilding tree, double the use of the
scheduling budget, which could lead to missing vectorization due to
exausted scheduling resources.

Patch provide 2-way approach for graph reodering problem. At first, all
reordering is done in-place, it doe not required tree
deleting/rebuilding, it just rotates the scalars/orders/reuses masks in
the graph node.

The first step (top-to bottom) rotates the whole graph, similarly to the previous
implementation. Compiler counts the number of the most used orders of
the graph nodes with the same vectorization factor and then rotates the
subgraph with the given vectorization factor to the most used order, if
it is not empty. Then repeats the same procedure for the subgraphs with
the smaller vectorization factor. We can do this because we still need
to reshuffle smaller subgraph when buildiong operands for the graph
nodes with lasrger vectorization factor, we can rotate just subgraph,
not the whole graph.

The second step (bottom-to-top) scans through the leaves and tries to
detect the users of the leaves which can be reordered. If the leaves can
be reorder in the best fashion, they are reordered and their user too.
It allows to remove double shuffles to the same ordering of the operands in
many cases and just reorder the user operations instead. Plus, it moves
the final shuffles closer to the top of the graph and in many cases
allows to remove extra shuffle because the same procedure is repeated
again and we can again merge some reordering masks and reorder user nodes
instead of the operands.

Also, patch improves cost model for gathering of loads, which improves
x264 benchmark in some cases.

Gives about +2% on AVX512 + LTO (more expected for AVX/AVX2) for {625,525}x264,
+3% for 508.namd, improves most of other benchmarks.
The compile and link time are almost the same, though in some cases it
should be better (we're not doing an extra instruction scheduling
anymore) + we may vectorize more code for the large basic blocks again
because of saving scheduling budget.

Differential Revision: https://reviews.llvm.org/D105020
2021-08-26 07:19:07 -07:00
Kent Ross 5d993d3bc5 [libc++][doc] Repair files with CRLF line endings.
These are the only files in libc++ that have CRLF line endings instead of LF.

Differential Revision: https://reviews.llvm.org/D108748
2021-08-26 10:09:07 -04:00
Simon Pilgrim c17f5afa88 [X86] getShape - don't dereference dyn_cast<>
dyn_cast can return nullptr, use cast<> to assert we have the correct type.
2021-08-26 15:08:13 +01:00
Simon Pilgrim 47f2affa08 Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFCI. 2021-08-26 15:08:12 +01:00
Balazs Benics 379b6394d9 Revert "[analyzer] Extend the documentation of MallocOverflow"
This reverts commit 6097a41924.
2021-08-26 15:29:32 +02:00
Balazs Benics 6097a41924 [analyzer] Extend the documentation of MallocOverflow
Previously by following the documentation it was not immediately clear
what the capabilities of this checker are.

In this patch, I add some clarification on when does the checker issue a
report and what it's limitations are.
I'm also advertising suppressing such reports by adding an assertion, as
demonstrated by the test3().
I'm highlighting that this checker might produce an extensive amount of
findings, but it might be still useful for code audits.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D107756
2021-08-26 15:20:41 +02:00
Andrew Wei 99c4336374 [LoopDataPrefetch] Add missed LoopSimplify dependence for prefetch pass
SCEVExpander::expandCodeFor may expand add recurrences for loop with a preheader,
so we should make LoopDataPrefetch dependent on LoopSimplify.
This patch will try to fix : https://bugs.llvm.org/show_bug.cgi?id=43784

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D108448
2021-08-26 21:01:59 +08:00
Jessica Clarke 8f89e2f6c9 [AMDGPU] Remove dead and broken ComplexPatterns
SelectADDRParam was discovered as being dead 5 years ago and removed in
7b4ef068c6 but the unused ComplexPattern definition was left behind.
SelectADDRDWord has never existed as far as I can tell, even back when
AMDGPU was R600-only and called that.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D108758
2021-08-26 12:48:32 +01:00
Jessica Clarke 2cbdf7e131 [SelectionDAG] Remove unused SDTConvertOp
This was used by CONVERT_RNDSAT, which was removed in def496c04b, so
the profile is now unused.

Reviewed By: xgupta

Differential Revision: https://reviews.llvm.org/D108508
2021-08-26 12:48:17 +01:00
Andrea Di Biagio 4a5b191703 [X86][MCA] Address the latest issues with MULX reported in PR51495.
It turns out that SchedWrite WriteIMulH was always assigned to the low half of
the result of a MULX (rather than to the high half).

To avoid confusion, this patch swaps the two MULX writes in the tablegen
definition of MULX32/64.  That way, write names better describe what they
actually refer to; this also avoids further complications if in future we decide
to reuse the same MulH writes to also model other scalar integer multiply
instructions.  I also had to swap the latency values for the two MULX writes to
make sure that the change is effectively an NFC. In fact, none of the existing
x86 tests were affected by this small refactoring.

This patch also fixes a bug in MCA: a wrong latency value was propagated for
instructions that perform multiple writes to a same register.  This last issue
was found by Roman while testing MULX on targets that define a different latency
for the Low/High part of the result.

Differential Revision: https://reviews.llvm.org/D108727
2021-08-26 12:08:20 +01:00
Alex Richardson b475ce39e8 [sanitizer] Fix build on FreeBSD RISC-V
We have to avoid calling renameat2 and clone on FreeBSD.
Additionally, the mcontext structure has different members.

Reviewed By: jrtc27, luismarques

Differential Revision: https://reviews.llvm.org/D103886
2021-08-26 12:05:37 +01:00
Sindhu Chittireddy de15979bc3 Assert pointer cannot be null; NFC
Klocwork static code analysis exposed this concern:
Pointer 'SubExpr' returned from call to getSubExpr() function which may
return NULL from 'cast_or_null<Expr>(Operand)', which will be
dereferenced in the statement following it

Add an assert on SubExpr to make it clear this pointer cannot be null.
2021-08-26 06:58:56 -04:00
Matthew Devereau 9b830c798e [AArch64][SVE] Teach cost model masked gathers/scatters are cheap
Tell the cost model to use the scalable calculation for non-neon fixed vector.
This results in a cheaper cost for fixed-length SVE masked gathers/scatters
allowing the vectorizor to emit them more frequently.
2021-08-26 11:17:47 +01:00
Benjamin Kramer bd7ece4e06 [X86] Don't write to the source directory in test 2021-08-26 12:11:20 +02:00
Roman Lebedev 564d85e090
The maximal representable alignment in LLVM IR is 1GiB, not 512MiB
In LLVM IR, `AlignmentBitfieldElementT` is 5-bit wide
But that means that the maximal alignment exponent is `(1<<5)-2`,
which is `30`, not `29`. And indeed, alignment of `1073741824`
roundtrips IR serialization-deserialization.

While this doesn't seem all that important, this doubles
the maximal supported alignment from 512MiB to 1GiB,
and there's actually one noticeable use-case for that;
On X86, the huge pages can have sizes of 2MiB and 1GiB (!).

So while this doesn't add support for truly huge alignments,
which i think we can easily-ish do if wanted, i think this adds
zero-cost support for a not-trivially-dismissable case.

I don't believe we need any upgrade infrastructure,
and since we don't explicitly record the IR version,
we don't need to bump one either.

As @craig.topper speculates in D108661#2963519,
this might be an artificial limit imposed by the original implementation
of the `getAlignment()` functions.

Differential Revision: https://reviews.llvm.org/D108661
2021-08-26 12:53:39 +03:00
Benjamin Kramer 5ece556271 [libunwind] Don't include cet.h/immintrin.h unconditionally
These may not exist when CET isn't available.
2021-08-26 11:37:07 +02:00
Alex Richardson 581613413c Make Value::MaxAlignment(Exponent) constexpr
This avoids references to the variables be generated when using e.g. max().

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D95050
2021-08-26 10:09:40 +01:00
Alex Richardson 7cab90a7b1 Fix __attribute__((annotate("")) with non-zero globals AS
The existing code attempting to bitcast from a value in the default globals AS
to i8 addrspace(0)* was triggering an assertion failure in our downstream fork.
I found this while compiling poppler for CHERI-RISC-V (we use AS200 for all
globals). The test case uses AMDGPU since that is one of the in-tree targets
with a non-zero default globals address space.
The new test previously triggered a "Invalid constantexpr bitcast!" assertion
and now correctly generates code with addrspace(1) pointers.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D105972
2021-08-26 10:09:40 +01:00
Alex Richardson bf66b0eefc Fix LLVM_ENABLE_THREADS check from 26a92d5852
We should be using #if instead of #ifdef here since LLVM_ENABLE_THREADS
is set using #cmakedefine01 so is always defined.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D108110
2021-08-26 10:09:39 +01:00
Florian Hahn aa5b6c9779
[ConstraintElimination] Initial support for using info from assumes.
This patch adds initial support to use facts from @llvm.assume calls. It
intentionally does not handle all possible cases to keep things simple
initially.

For now, the condition from an assume is made available on entry to the
containing block, if the assume is guaranteed to execute. Otherwise it
is only made available in the successor blocks.
2021-08-26 10:08:00 +01:00
Florian Hahn dd1ec869b0
[ConstraintElimination] Add more assume tests. 2021-08-26 10:07:47 +01:00
David Green 6ffc6951a3 [AArch64] Remove unpredictable from narrowing instructions.
Like other similar instructions the xtn2 family do not have side
effects, and explicitly marking them as such can help improve scheduling
freedom.
2021-08-26 09:43:44 +01:00
David Green 9474b03d41 [AArch64] Add a Cortex-A55 NEON scheduler test case. 2021-08-26 09:43:44 +01:00
Jay Foad 985eb25546 [MachineScheduler] Fix tracing
Consistently print a newline before "RegionInstrs:".
2021-08-26 09:27:01 +01:00
LLVM GN Syncbot 6894552a74 [gn build] Port 21b25a1fb3 2021-08-26 08:14:37 +00:00