Commit Graph

47861 Commits

Author SHA1 Message Date
Pavel Labath dbb158ebf4 Remove top-level using directives from Transforms/IPO headers
These directives pollute the namespace of all files which include the
header.
2022-04-05 11:22:37 +02:00
Muhammad Omair Javaid 0320115c16 Revert "[CodeGen] Async unwind - add a pass to fix CFI information"
This reverts commit 980c3e6dd2.

This commit had failing tests with clang crashing across various
AArch64/Linux buildots.

https://lab.llvm.org/buildbot/#/builders/179/builds/3346

Differential Revision: https://reviews.llvm.org/D114545
2022-04-05 13:12:30 +05:00
Ilia Diachkov 28a681316f Fix nulltpr typo in comment. NFC
The patch fixes the typo "nulltpr", accidentally found in comments.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D122993
2022-04-04 09:05:06 -07:00
Momchil Velikov 980c3e6dd2 [CodeGen] Async unwind - add a pass to fix CFI information
This pass inserts the necessary CFI instructions to compensate for the
inconsistency of the call-frame information caused by linear (non-CFG
aware) nature of the unwind tables.

Unlike the `CFIInstrInserer` pass, this one almost always emits only
`.cfi_remember_state`/`.cfi_restore_state`, which results in smaller
unwind tables and also transparently handles custom unwind info
extensions like CFA offset adjustement and save locations of SVE
registers.

This pass takes advantage of the constraints that LLVM imposes on the
placement of save/restore points (cf. `ShrinkWrap.cpp`):

  * there is a single basic block, containing the function prologue

  * possibly multiple epilogue blocks, where each epilogue block is
    complete and self-contained, i.e. CSR restore instructions (and the
    corresponding CFI instructions are not split across two or more
    blocks.

  * prologue and epilogue blocks are outside of any loops

Thus, during execution, at the beginning and at the end of each basic
block the function can be in one of two states:

  - "has a call frame", if the function has executed the prologue, or
     has not executed any epilogue

  - "does not have a call frame", if the function has not executed the
    prologue, or has executed an epilogue

These properties can be computed for each basic block by a single RPO
traversal.

In order to accommodate backends which do not generate unwind info in
epilogues we compute an additional property "strong no call frame on
entry" which is set for the entry point of the function and for every
block reachable from the entry along a path that does not execute the
prologue. If this property holds, it takes precedence over the "has a
call frame" property.

From the point of view of the unwind tables, the "has/does not have
call frame" state at beginning of each block is determined by the
state at the end of the previous block, in layout order.

Where these states differ, we insert compensating CFI instructions,
which come in two flavours:

- CFI instructions, which reset the unwind table state to the
    initial one.  This is done by a target specific hook and is
    expected to be trivial to implement, for example it could be:
```
     .cfi_def_cfa <sp>, 0
     .cfi_same_value <rN>
     .cfi_same_value <rN-1>
     ...
```
where `<rN>` are the callee-saved registers.

- CFI instructions, which reset the unwind table state to the one
    created by the function prologue. These are the sequence:
```
       .cfi_restore_state
       .cfi_remember_state
```
In this case we also insert a `.cfi_remember_state` after the
last CFI instruction in the function prologue.

Reviewed By: MaskRay, danielkiss, chill

Differential Revision: https://reviews.llvm.org/D114545
2022-04-04 14:38:22 +01:00
Nathan Sidwell ee6ec9e861 [demangler] Parenthesize >> inside template args
Both > and >> expressions need to be parenthesized inside template
argument lists.

Reviewed By: dblaikie, rjmccall

Differential Revision: https://reviews.llvm.org/D122474
2022-04-04 06:35:32 -07:00
Nikita Popov c0cc98251a [Float2Int] Make sure dependent ranges are calculated first (PR54669)
The range calculation in walkForwards() assumes that the ranges of
the operands have already been calculated. With the used visit
order, this is not necessarily the case when there are multiple
roots. (There is nothing guaranteeing that instructions are visited
in topological order.)

Fix this by queuing instructions for reprocessing if the operand
ranges haven't been calculated yet.

Fixes https://github.com/llvm/llvm-project/issues/54669.

Differential Revision: https://reviews.llvm.org/D122817
2022-04-04 10:18:39 +02:00
Augie Fackler e90bce8f91 CallBase: fix getFnAttr so it also checks the function
Prior to this change, CallBase::hasFnAttr checked the called function to
see if it had an attribute if it wasn't set on the CallBase, but
getFnAttr didn't do the same delegation, which led to very confusing
behavior. This patch fixes the issue by making CallBase::getFnAttr also
check the function under the same circumstances.

Test changes look (to me) like they're cleaning up redundant attributes
which no longer get specified both on the callee and call. We also clean
up the one ad-hoc implementation of this getter over in InlineCost.cpp.

Differential Revision: https://reviews.llvm.org/D122821
2022-04-03 23:19:23 -04:00
Philip Reames 7c51669c21 [memcpyopt] Restructure store(load src, dest) form of callslotopt for compile time
The search for the clobbering call is fairly expensive if uses are not optimized at construction.  Defer the clobber walk to the point in the implementation we need it; there are a bunch of bailouts before that point.  (e.g. If the source pointer is not an alloca, we can't do callslotopt.)

On a test case which involves a bunch of copies from argument pointers, this switches memcpyopt from > 1/2 second to < 10ms.
2022-04-03 20:16:20 -07:00
Simon Pilgrim 76cd11f303 [DAG] Add llvm::isMinSignedConstant helper. NFC
Pulled out of D122754
2022-04-01 17:47:34 +01:00
Nathan Sidwell abffdd8876 [demangler] Fix node matchers
* Add instantiation tests to ItaniumDemangleTest, to make sure all
  match functions provide constructor arguments to the provided functor.

* Fix the Node constructors that lost const qualification on arguments.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D122665
2022-04-01 05:19:34 -07:00
Nathan Sidwell 369337e3c2 [demangler][NFC] Use def file for node names
In order to add a unit test, we need to expose the node names beyond
ItaniumDemangle.h.  This breaks them out into a def file.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D122739
2022-04-01 05:03:34 -07:00
Peixin-Qiao 3e7415a0ff [OMPIRBuilder] Support ordered clause specified without parameter
This patch supports ordered clause specified without parameter in
worksharing-loop directive in the OpenMPIRBuilder and lowering MLIR to
LLVM IR.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D114940
2022-04-01 16:17:29 +08:00
Jake Vossen 4b82bb6d82 Fix Typo in SmallVector doc
Replace forward slash with backward slash.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D120930
2022-04-01 03:44:37 +00:00
yanming a7c0b7504c [VP] Add more cast VPintrinsic and docs.
Add vp.fptoui, vp.uitofp, vp.fptrunc, vp.fpext, vp.trunc, vp.zext, vp.sext, vp.ptrtoint, vp.inttoptr intrinsic and docs.

Reviewed By: frasercrmck, craig.topper

Differential Revision: https://reviews.llvm.org/D122291
2022-04-01 09:16:10 +08:00
Jorge Gorbe Moya fc7573f29c Revert "[misexpect] Re-implement MisExpect Diagnostics"
This reverts commit 46774df307.
2022-03-31 14:54:41 -07:00
Luboš Luňák de4bcdc2ba include stddef.h for size_t
Needed at least for -DLLVM_ENABLE_MODULES=On.
2022-03-31 23:51:07 +02:00
Paul Kirth 46774df307 [misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.

New checks rely on 2 invariants:

1) For frontend instrumentation, MD_prof branch_weights will always be
   populated before llvm.expect intrinsics are lowered.

2) for IR and sample profiling, llvm.expect intrinsics will always be
   lowered before branch_weights are populated from the IR profiles.

These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.

Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D115907
2022-03-31 17:38:21 +00:00
Wenju He 0bda12b5bc [NewPM] Add OptimizerEarly module extension point
VectorizerStart extension is module callback in old PM, but is function
callback in new PM. We lack a module extension point between end of
buildModuleSimplificationPipeline and the function optimization
(including vectorizer) pipeline. So this patch adds a new module
extension point before the function optimization pipeline.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D122296
2022-03-31 08:22:27 -07:00
Nikita Popov f66975555f [Float2Int] Extract calcRange() method (NFC)
This avoids the awkward "Abort" flag, because we can simply
early-return instead.
2022-03-31 16:13:13 +02:00
Jay Foad aa4c055e25 [AMDGPU] Document the intended semantics of llvm.amdgcn.s.buffer.load
Differential Revision: https://reviews.llvm.org/D122653
2022-03-31 09:13:30 +01:00
Serge Pavlov 881350a92d Mapping of FP operations to constrained intrinsics
A new function 'getConstrainedIntrinsic' is added, which for any gived
instruction returns id of the corresponding constrained intrinsic. If
there is no constrained counterpart for the instruction or the instruction
is already a constrained intrinsic, the function returns zero.

This is recommit of 115b3ace36, reverted in
8160dd582b.

Differential Revision: https://reviews.llvm.org/D69562
2022-03-31 11:07:47 +07:00
David Blaikie 6f5ecd089f Demangle: Fix crash-on-invalid demangling of a module name with no underlying entity 2022-03-30 20:26:32 +00:00
Amir Ayupov c31af7cfe3 [MC][BOLT] Add setter for AllowAtInName
Use the setter in BOLT to allow printing names with variant kind in the name
(e.g. "func@PLT").
Fixes BOLT buildbot tests that broke after D122516:
https://lab.llvm.org/buildbot/#/builders/215/builds/3595

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D122694
2022-03-30 13:04:28 -07:00
Eli Friedman 72517e27c1 [AArch64] Fix AArch64TargetParser.def to match AArch64.td.
Currently, we have two different lists of features each CPU supports...
and those lists aren't consistent. This patch assumes AArch64.td is
right, and tries to fix AArch64TargetParser to match.

It's hard to find documentation for the right features, but reviewers
have confirmed these changes.

Probably we should try to unify the two lists at some point, but
synchronizing them seems like a prerequisite to that anyway.

Differential Revision: https://reviews.llvm.org/D122274
2022-03-30 12:15:39 -07:00
Ben Barham 3fda0edc51 [VFS] RedirectingFileSystem only replace path if not already mapped
If the `ExternalFS` has already remapped a path then the
`RedirectingFileSystem` should not change it to the originally provided
path. This fixes the original path always being used if multiple VFS
overlays were provided and the path wasn't found in the highest (ie.
first in the chain).

This also renames `IsVFSMapped` to `ExposesExternalVFSPath` and only
sets it if `UseExternalName` is true. This flag then represents that the
`Status` has an external path that's different from its virtual path.
Right now the contained path is still the external path, but further PRs
will change this to *always* be the virtual path. Clients that need the
external can then request it specifically.

Note that even though `ExposesExternalVFSPath` isn't set for all
VFS-mapped paths, `IsVFSMapped` was only being used by a hack in
`FileManager` that was specific to module searching. In that case
`UseExternalNames` is always `true` and so that hack still applies.

Resolves rdar://90578880 and llvm-project#53306.

Differential Revision: https://reviews.llvm.org/D122549
2022-03-30 11:52:41 -07:00
Fraser Cormack 73244e8f85 [VP] Add vp.icmp comparison intrinsic and docs
This patch mostly follows up on D121292 which introduced the vp.fcmp
intrinsic.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D122729
2022-03-30 17:05:11 +01:00
Sanjay Patel 436b875e49 [SDAG] avoid libcalls to fmin/fmax for soft-float targets
This is an extension of D70965 to avoid creating a mathlib
call where it did not exist in the original source. Also see
D70852 for discussion about an alternative proposal that was
abandoned.

In the motivating bug report:
https://github.com/llvm/llvm-project/issues/54554
...we also have a more general issue about handling "no-builtin" options.

Differential Revision: https://reviews.llvm.org/D122610
2022-03-30 11:22:03 -04:00
Fraser Cormack da6131f20a [VP] Add vp.fcmp comparison intrinsic and docs
This patch adds the first support for vector-predicated comparison
intrinsics, starting with vp.fcmp. It uses metadata to encode its
condition code, like the llvm.experimental.constrained.fcmp intrinsic.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121292
2022-03-30 14:39:18 +01:00
Serge Pavlov 8160dd582b Revert "Mapping of FP operations to constrained intrinsics"
This reverts commit 115b3ace36.
Starting from this commit the buildbot sanitizer-x86_64-linux-bootstrap-msan
starts failing (build 10071). Reverted for investigation.
2022-03-30 16:46:43 +07:00
Vitaly Buka 15972e37ba [CodeGen] Avoid access after runtime
Insts must be destroyd before xParent
or it can read it with stack like this:
   0 in llvm::MachineInstr::getMF() const MachineInstr.cpp:637:3
   1 in getMF MachineInstr.h:302:50
   2 in removeNodeFromList MachineBasicBlock.cpp:163:32
2022-03-30 02:08:13 -07:00
Luboš Luňák a60e09509c add missing include for -DLLVM_ENABLE_MODULES=On 2022-03-30 10:31:59 +02:00
Serge Pavlov 115b3ace36 Mapping of FP operations to constrained intrinsics
A new function 'getConstrainedIntrinsic' is added, which for any gived
instruction returns id of the corresponding constrained intrinsic. If
there is no constrained counterpart for the instruction or the instruction
is already a constrained intrinsic, the function returns zero.

Differential Revision: https://reviews.llvm.org/D69562
2022-03-30 12:21:30 +07:00
Zakk Chen 10b2760da0 Revert "[RISCV] Add policy operand for masked compare and vmsbf/vmsif/vmsof IR"
This reverts commit 10fd2822b7.

I have a better implementation for those operations without the
additional policy operand.
masked compare and vmsbf/vmsif/vmsof are always tail agnostic so we could
assume undef maskedoff is mask agnostic.

Differential Revision: https://reviews.llvm.org/D122455
2022-03-29 18:05:33 -07:00
Chris Bieneman 9130e471fe Add DXContainer
DXIL is wrapped in a container format defined by the DirectX 11
specification. Codebases differ in calling this format either DXBC or
DXILContainer.

Since eventually we want to add support for DXBC as a target
architecture and the format is used by DXBC and DXIL, I've termed it
DXContainer here.

Most of the changes in this patch are just adding cases to switch
statements to address warnings.

Reviewed By: pete

Differential Revision: https://reviews.llvm.org/D122062
2022-03-29 14:34:23 -05:00
Chris Bieneman b39f437757 [ADT] add initializer list specialization for is_contained
Adding an initializer list specialization for is_contained allows for
compile-time evaluation when called with a constant or runtime
evaluation for non-constant values.

This patch doesn't add any uses of this template, but that is coming in
a subsequent patch.

Reviewed By: pete

Differential Revision: https://reviews.llvm.org/D122079
2022-03-29 12:39:39 -05:00
Chris Bieneman 5b6207f3cd [ADT] Flesh out HLSL raytracing environments
Fleshing this out now allows me to rely on enum math to translate
values rather than having to translate the off cases.

I should have added this in the first pass, but wasn't thinking about
it.
2022-03-29 09:43:03 -05:00
Nathan Sidwell c204cee642 [demangler] Update node match calls
Each demangler node's match function needs to call the provided
functor with constructor arguments.  That was omitted from D120905.
This adds the new Precedence argument where necessary (and a missing
boolean for a module node).

The two visitors need updating with a printer for that type, and this
adds a stub to cxa_demangle's version.  blaikie added one to llvm's.
I'll fill out those printers in a followup, rather than wait, so that
downstream consumers are unbroken.
2022-03-29 05:32:36 -07:00
Thomas Preud'homme f1d8e46258 Clarify invariants of software pipelining hooks
PowerPC backend relies on each pair of prologue/epilogue of a software
pipelined loop to correspond to a single iteration a the loop through
its use of the BDZ instruction to skip inner prologues/epilogues and
loop kernel. However the interface does not make it clear that it is a
valid way to check that the trip count is big enough to execute inner
prologues/epilogues and kernel loop.

The API also does not specify in which order of prologues the
createTripCountGreaterCondition() hook is being called. Knowing that it
starts with the last/innermost prologues can help recording some
information when createTripCountGreaterCondition() is first executed and
reuse it in setPreheader() or adjustTripCount().

This commit documents both aspects.

Reviewed By: jmolloy

Differential Revision: https://reviews.llvm.org/D122642
2022-03-29 11:44:10 +01:00
serge-sans-paille 01be9be2f2 Cleanup includes: final pass
Cleanup a few extra files, this closes the work on libLLVM dependencies on my
side.

Impact on libLLVM preprocessed output: -35876 lines

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D122576
2022-03-29 09:00:21 +02:00
Paul Kirth 90cb325abd Revert "[misexpect] Re-implement MisExpect Diagnostics"
This reverts commit 2add3fbd97.
2022-03-29 06:20:30 +00:00
Johannes Doerfert 7df2eba7fa [Attributor][OpenMP] Add assumption for non-call assembly instructions
Inline assembly is scary but we need to support it for the OpenMP GPU
device runtime. The new assumption expresses the fact that it may not
have call semantics, that is, it will not call another function but
simply perform an operation or side-effect. This is important for
reachability in the presence of inline assembly.

Differential Revision: https://reviews.llvm.org/D109986
2022-03-28 20:57:52 -05:00
Johannes Doerfert bb0b23174e [InstCombineCalls] Optimize call of bitcast even w/ parameter attributes
Before we gave up if a call through bitcast had parameter attributes.
Interestingly, we allowed attributes for the return value already. We
now handle both the same way, namely, we drop the ones that are
incompatible with the new type and keep the rest. This cannot cause
"more UB" than initially present.

Differential Revision: https://reviews.llvm.org/D119967
2022-03-28 20:57:52 -05:00
Shao-Ce SUN 662b9fa02c [NFC][CodeGen] Add a setTargetDAGCombine use ArrayRef
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D122557
2022-03-29 09:53:24 +08:00
Paul Kirth 2add3fbd97 [misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.

New checks rely on 2 invariants:

1) For frontend instrumentation, MD_prof branch_weights will always be
   populated before llvm.expect intrinsics are lowered.

2) for IR and sample profiling, llvm.expect intrinsics will always be
   lowered before branch_weights are populated from the IR profiles.

These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.

Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D115907
2022-03-28 23:30:04 +00:00
David Blaikie 1d1cf9b6c4 ItaniumDemangler: Update BinaryExpr::match to match the ctor
Not sure if this could use more testing, but hopefully this is adequate.
2022-03-28 21:51:27 +00:00
zhijian 39772da5fd [AIX][XCOFF] address post-commit review comments of patch https://reviews.llvm.org/D82549
Summary:
Address post-commit review comments in the https://reviews.llvm.org/D82549, including

changed file name from llvm/test/tools/llvm-readobj/XCOFF/xcoff-auxiliary-header.test --> llvm/test/tools/llvm-readobj/XCOFF/auxiliary-header.test
replaced macro define by using lambda function.
added a helper function to reduce the duplicated check and print error code.

Reviewer : James Henderson
Differential Revision: https://reviews.llvm.org/D116220
2022-03-28 15:05:41 -04:00
Nathan Sidwell 1066e397fa [demangler] Add StringView conversion operator
The OutputBuffer class tries to present a NUL-terminated string API to
consumers.  But several of them would prefer a StringView.  In
particular the Microsoft demangler, juggles between NUL-terminated and
StringView, which is confusing.

This adds a StringView conversion, and adjusts the Demanglers that can
benefit from that.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D120990
2022-03-28 11:19:55 -07:00
Jyotsna Verma 65a2f6ad9c [Hexagon] Create an intrinsic to profile using a custom handler
The intrinsic is lowered into a hexagon pseudo instruction which
after register allocation is expanded into A2_tfrsi and J2_call.
2022-03-28 10:31:41 -05:00
Nathan Sidwell b3b4113a23 [demangler] Add operator precedence
The demangler had no concept of operator precendence, and would
parenthesize many more subexpressions than necessary.  In particular
it would parenthesize primary-expressions, such as '4', which just
looks strange.  It would also parenthesize '>' expressions, just in
case they were inside a template parameter list.

This patch fixes both issues.

* Add operator precedence to the OpInfo structure, and add a
  subexpression helper that will parenthesize a lower precedence
  subexpression.

* Add a 'greater-than is greater-than' indicator to the output buffer,
  so the expression printer knows whether it is immediately inside a
  template parameter list (and must therefore parenthesize 'expr >
  expr').  This is a counter, so that ...

* Add open and close printers to the output buffer, that increment and
  decrement the gt-is-gt indicator.

* Parenthesize comma operators inside comma-separated lists. (probably
  a rare case, but still).

This dramatically reduces the extraneous parentheses being printed.

Reviewed By: dblaikie, bruno

Differential Revision: https://reviews.llvm.org/D120905
2022-03-28 06:17:57 -07:00
Pavel Labath ec6d621050 Remove a top-level using-directive from EPCDebugObjectRegistrar.h
The directive pollutes the namespace of all files which include the
header.

Use alternate ways to reference the namespace constituents instead.
2022-03-28 15:14:20 +02:00
Alexandros Lamprineas 8045bf9d0d [FuncSpec] Support function specialization across multiple arguments.
The current implementation of Function Specialization does not allow
specializing more than one arguments per function call, which is a
limitation I am lifting with this patch.

My main challenge was to choose the most suitable ADT for storing the
specializations. We need an associative container for binding all the
actual arguments of a specialization to the function call. We also
need a consistent iteration order across executions. Lastly we want
to be able to sort the entries by Gain and reject the least profitable
ones.

MapVector fits the bill but not quite; erasing elements is expensive
and using stable_sort messes up the indices to the underlying vector.
I am therefore using the underlying vector directly after calculating
the Gain.

Differential Revision: https://reviews.llvm.org/D119880
2022-03-28 12:01:53 +01:00
Fangrui Song c0eb9b4cde Revert D121984 "[RISCV][NFC] Moving RVV intrinsic type related util to llvm/Support"
This reverts commit ad57e10dbc and 1967fd8d5e

llvm/lib/Support/RISCVVIntrinsicUtils.cpp introduced llvm/TableGen includes,
a circular dependency https://llvm.org/docs/CodingStandards.html#library-layering
I think this particular instance is serious and should be reverted.
2022-03-28 01:17:37 -07:00
Fangrui Song 1967fd8d5e [RISCV] Remove using namespace llvm from public header after D121984 2022-03-28 00:51:58 -07:00
Kito Cheng ad57e10dbc [RISCV][NFC] Moving RVV intrinsic type related util to llvm/Support
This patch is split from https://reviews.llvm.org/D111617, we need those
stuffs on clang, so must moving those stuff to llvm/Support.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D121984
2022-03-28 14:35:28 +08:00
Luo, Yuanke 321cbf75be [Verifier] Verify parameter alignment.
In DAGISel, the parameter alignment only have 4 bits to hold the value.
The encode(alignment) would plus the shift value by 1, so the max aligment
ISel can support is 2^14. This patch verify the parameter and return
value for alignment.

Differential Revision: https://reviews.llvm.org/D121898
2022-03-27 08:35:05 +08:00
Fangrui Song 02f20a09c3 [Option] Remove the error-prone default argument true from 4-argument hasFlag 2022-03-26 01:09:18 -07:00
Fangrui Song 522712e2d2 [Option] Remove the error-prone default argument true from 3-argument hasFlag 2022-03-26 00:58:39 -07:00
Adrian Prantl 1f98e09bf8 Add missing include diagnosed in modules build. (NFC) 2022-03-25 12:40:08 -07:00
Dávid Bolvanský 39d348c602 [NFCI] Fix set-but-unused warning in DenseMap.h in some configurations 2022-03-25 17:12:53 +01:00
Johannes Doerfert a81fff8afd Reapply "[Intrinsics] Add `nocallback` to the default intrinsic attributes"
This reverts commit c5f789050d and
reapplies 7aea3ea8c3 with additional test
changes.
2022-03-25 09:36:50 -05:00
Carlos Alberto Enciso 75112133b8 [llvm-pdbutil] Move InputFile/FormatUtil/LinePrinter to PDB library.
At Sony we are developing llvm-dva

https://lists.llvm.org/pipermail/llvm-dev/2020-August/144174.html

For its PDB support, it requires functionality already present in
llvm-pdbutil.

We intend to move that functionaly into the PDB library to be
shared by both tools. That change will be done in 2 steps, that
will be submitted as 2 patches:

(1) Replace 'ExitOnError' with explicit error handling.
(2) Move the intended shared code to the PDB library.

Patch for step (1): https://reviews.llvm.org/D121801

This patch is for step (2).

Move InputFile.cpp[h], FormatUtil.cpp[h] and LinePrinter.cpp[h]
files to the debug PDB library.

It exposes the following functionality that can be used by tools:

- Open a PDB file.
- Get module debug stream.
- Traverse module sections.
- Traverse module subsections.

Most of the needed functionality is in InputFile, but there are
dependencies from LinePrinter and FormatUtil.

Some other functionality is in the following functions in
DumpOutputStyle.cpp file:

- iterateModuleSubsections
- getModuleDebugStream
- iterateOneModule
- iterateSymbolGroups
- iterateModuleSubsections

Only these specific functions from DumpOutputStyle are moved to
the PDB library.

Reviewed By: aganea, dblaikie, rnk

Differential Revision: https://reviews.llvm.org/D122226
2022-03-25 07:12:58 +00:00
Stanislav Mekhanoshin 6e3e14f600 [AMDGPU] Support gfx940 smfmac instructions
Differential Revision: https://reviews.llvm.org/D122191
2022-03-24 12:40:42 -07:00
Stanislav Mekhanoshin 27439a7642 [AMDGPU] New gfx940 mfma instructions
Differential Revision: https://reviews.llvm.org/D122044
2022-03-24 12:12:52 -07:00
Johannes Doerfert c5f789050d Revert "[Intrinsics] Add `nocallback` to the default intrinsic attributes"
This reverts commit 7aea3ea8c3 as it
breaks the buildbots.

I didn't see these failures in the pre-merge checks, looking into it.
2022-03-24 14:04:41 -05:00
Johannes Doerfert 7aea3ea8c3 [Intrinsics] Add `nocallback` to the default intrinsic attributes
Most intrinsics, especially "default" ones, will not call back into the
IR module. `nocallback` encodes this nicely. As it was not used before,
this patch also makes use of `nocallback` in the Attributor which
results in many more `norecurse` deductions.

Tablegen part is mechanical, test updates by script.

Differential Revision: https://reviews.llvm.org/D118680
2022-03-24 13:50:54 -05:00
Argyrios Kyrtzidis 7f05aa2d4c [Support/BLAKE3] LLVM-specific changes over the original BLAKE3 C implementation
Changes from original BLAKE3 sources:

* `blake.h`:
    * Changes to avoid conflicts if a client also links with its own BLAKE3 version:
        * Renamed the header macro guard with `LLVM_C_` prefix
        * Renamed the C symbols to add the `llvm_` prefix
    * Added a top header comment that references the CC0 license and points to the `LICENSE` file in the repo.
* `blake3_impl.h`: Added `#define`s to remove some of `llvm_` prefixes for the rest of the internal implementation.
* Implementation files:
    * Added a top header comment for `blake.c`
    * Used `llvm_` prefix for the C public API functions
    * Used `LLVM_LIBRARY_VISIBILITY` for internal implementation functions
    * Added `.private_extern`/`.hidden` in assembly files to reduce visibility of the internal implementation functions
* `README.md`:
    * added a note about where the sources originated from
    * Used the C++ BLAKE3 class and `llvm_` prefixed C API in place of examples and API documentation.
    * Removed instructions about how to build the files.
2022-03-24 10:26:39 -07:00
Argyrios Kyrtzidis 9aa701984d [Support] Introduce the BLAKE3 hashing function implementation
BLAKE3 is a cryptographic hash function that is secure and very performant.
The C implementation originates from https://github.com/BLAKE3-team/BLAKE3/tree/1.3.1/c
License is at https://github.com/BLAKE3-team/BLAKE3/blob/1.3.1/LICENSE

This patch adds:

* `llvm/include/llvm-c/blake3.h`: The BLAKE3 C API
* `llvm/include/llvm/Support/BLAKE3.h`: C++ wrapper of the C API
* `llvm/lib/Support/BLAKE3`: Directory containing the BLAKE3 C implementation files, including the `LICENSE` file
* `llvm/unittests/Support/BLAKE3Test.cpp`: unit tests for the BLAKE3 C++ wrapper

This initial patch contains the pristine BLAKE3 sources, a follow-up patch will introduce
LLVM-specific prefixes to avoid conflicts if a client also links with its own BLAKE3 version.

And here's some timings comparing BLAKE3 with LLVM's SHA1/SHA256/MD5.
Timings include `AVX512`, `AVX2`, `neon`, and the generic/portable implementations.
The table shows the speed-up multiplier of BLAKE3 for hashing 100 MBs:

|        Processor        | SHA1  | SHA256 |  MD5 |
|-------------------------|-------|--------|------|
| Intel Xeon W (AVX512)   | 10.4x |   27x  | 9.4x |
| Intel Xeon W (AVX2)     | 6.5x  |   17x  | 5.9x |
| Intel Xeon W (portable) | 1.3x  |  3.3x  | 1.1x |
|      M1Pro (neon)       | 2.1x  |  4.7x  | 2.8x |
|      M1Pro (portable)   | 1.1x  |  2.4x  | 1.5x |

Differential Revision: https://reviews.llvm.org/D121510
2022-03-24 10:26:39 -07:00
Shraiysh Vaishay 8722c12c12 [mlir][OpenMP][IRBuilder] Add support for nowait on single construct
This patch adds the nowait parameter to `createSingle` in
OpenMPIRBuilder and handling for IR generation from OpenMP Dialect.

Also added tests for the same.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D122371
2022-03-24 22:51:52 +05:30
Mike Rice f82ec5532b [OpenMP] Initial parsing/sema for the 'omp target parallel loop' construct
Adds basic parsing/sema/serialization support for the
 #pragma omp target parallel loop directive.

Differential Revision: https://reviews.llvm.org/D122359
2022-03-24 09:19:00 -07:00
Florian Hahn 46432a0088
[VPlan] Add VPWidenPointerInductionRecipe.
This patch moves pointer induction handling from VPWidenPHIRecipe to its
own recipe. In the process, it adds all information required to generate
code for pointer inductions without relying on Legal to access the list
of induction phis.

Alternatively VPWidenPHIRecipe could also take an optional pointer to InductionDescriptor.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D121615
2022-03-24 14:58:45 +00:00
Antonio Frighetto 6872c8bdc4 [NFC] Mark derived destructors as `override`
Derived destructors can be marked as override, in order to prevent
possible compilation failures of projects depending on those
headers (when compiled with flags -Wall, -Wsuggest-destructor-override,
-Winconsistent-missing-destructor-override).

Differential Revision: https://reviews.llvm.org/D121993
2022-03-24 11:42:47 +01:00
Daniil Kovalev c53cbce45e [CodeGen] Define ABI breaking class members correctly
Non-static class members declared under #ifndef NDEBUG should be declared
under #if LLVM_ENABLE_ABI_BREAKING_CHECKS to make headers library-friendly and
allow cross-linking, as discussed in D120714.

Differential Revision: https://reviews.llvm.org/D121549
2022-03-24 12:42:59 +03:00
Xiang1 Zhang 287dad13ab [InlineAsm] Fix mangle problem when global variable used in inline asm
(Add modifier P for ARR[BaseReg+IndexReg+..])

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D120887
2022-03-24 09:41:23 +08:00
Xiang1 Zhang 8a6b644c79 [Inline asm] Fix mangle problem when variable used in inline asm.
(Connect InlineAsm Memory Operand with its real value not just name)
Revert 2 history bugfix patch:

Revert "[X86][MS-InlineAsm] Make the constraint *m to be simple place holder"
This patch revert https://reviews.llvm.org/D115225 which mainly
fix problems intrduced by https://reviews.llvm.org/D113096

This reverts commit d7c07f60b3.

Revert "Reland "[X86][MS-InlineAsm] Use exact conditions to recognize MS global variables""
This patch revert https://reviews.llvm.org/D116090 which fix problem
intrduced by https://reviews.llvm.org/D115225

This reverts commit 24c68ea1eb.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D120886
2022-03-24 09:41:22 +08:00
Julian Lettner 64902d335c Reland "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO"
For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with
`__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`.

Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this.

Enable fallback to the old behavior via Clang driver flag
(`-fregister-global-dtors-with-atexit`) or llc / code generation flag
(`-lower-global-dtors-via-cxa-atexit`).  This escape hatch will be
removed in the future.

Differential Revision: https://reviews.llvm.org/D121736
2022-03-23 18:36:55 -07:00
Vasileios Porpodas 39aa202aff Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 3, fixed assertion crash.
Original review: https://reviews.llvm.org/D121354

This reverts commit e6ead19b77.
2022-03-23 18:32:17 -07:00
Zequan Wu 581dc3c729 Revert "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO"
This reverts commit 22570bac69.
2022-03-23 16:11:54 -07:00
Hongtao Yu 3f97016857 [llvm-profgen] Decoding pseudo probe for profiled function only.
Complete pseudo probes decoding can result in large memory usage. In practice only a small porting of the decoded probes are used in profile generation. I'm changing the full decoding mode to be decoding for profiled functions only, though we still do a full scan of the .pseudoprobe section due to a missing table-of-content but we don't have to build the in-memory data structure for functions not sampled.

To build the in-memory data structure for profiled functions only, I'm rewriting the previous non-recursive probe decoding logic to be recursive. This is easy to read and maintain.

I also have to change the previous representation of unsymbolized context from probe-based stack to address-based stack since the profiled functions are unknown yet by the time of virtual unwinding. The address-based stack will be converted to probe-based stack after virtual unwinding and on-demand probe decoding.

I'm seeing 20GB memory is saved for one of our internal large service.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D121643
2022-03-23 14:15:11 -07:00
Arthur Eubanks 9bd66b312c [PassManager][Coroutine] Run passes under -O0 conditionally and run GlobalDCE
CoroSplit lowers various coroutine intrinsics. It's a CGSCC pass and
CGSCC passes don't run on unreachable functions. Normally GlobalDCE will
come along and delete unreachable functions, but we don't run GlobalDCE
under -O0, so an unreachable function with coroutine intrinsics may
never have CoroSplit run on it.

This patch adds GlobalDCE when coroutines intrinsics are present. It
also now runs all coroutine passes conditional when coroutine intrinsics
are present. This should also solve the -O0 regression reported in
D105877 due to LazyCallGraph construction.

Fixes https://github.com/llvm/llvm-project/issues/54117

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D122275
2022-03-23 11:03:26 -07:00
Arthur Eubanks e6ead19b77 Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash."
This reverts commit 27bd8f9492.

Causes crashes, see comments in D121973
2022-03-23 10:57:45 -07:00
serge-sans-paille 02c28970b2 Cleanup include: codegen second round
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D122180
2022-03-23 13:54:00 +01:00
Marcus Johnson d14ccbc2e8 Re-land c346068928 with fixes
It was previously reverted in a6beb18b84
due to test failures.
2022-03-23 08:13:17 -04:00
Craig Topper 681fd2c11e Revert "[SelectionDAG] Don't create entries in ValueMap in ComputePHILiveOutRegInfo"
This reverts commit 1a9b55b63a.

Causing build bot failures
2022-03-22 23:41:47 -07:00
Craig Topper 1a9b55b63a [SelectionDAG] Don't create entries in ValueMap in ComputePHILiveOutRegInfo
Instead of using operator[], use DenseMap::find to prevent default
constructing an entry if it isn't already in the map.
2022-03-22 23:24:53 -07:00
Shengchen Kan b7a4b67380 [Bundle][Codegen] Ignore bundle for meta-instruction
The purpose is to keep the default behavior as before.
Noticed by comments in D121600.

Reviewed By: bjope

Differential Revision: https://reviews.llvm.org/D122221
2022-03-23 10:14:54 +08:00
Vasileios Porpodas 27bd8f9492 Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash.
Original review: https://reviews.llvm.org/D121354

This reverts commit f7d7d2a08d.
2022-03-22 16:41:55 -07:00
Snehasish Kumar 27a4f2545f Reland "[memprof] Store callsite metadata with memprof records."
This reverts commit f4b794427e.

Reland with underlying msan issue fixed in D122260.
2022-03-22 14:40:02 -07:00
Snehasish Kumar 61c75eb637 [memprof] Initialize MemInfoBlock data.
This patch updates the existing default no-arg constructor for
MemInfoBlock to explicitly initialize all members. Also add missing
DataTypeId initialization to the other constructor. These issues were
exposed by msan on patch D121179. With this patch D121179 builds cleanly
on msan.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D122260
2022-03-22 14:35:57 -07:00
Mike Rice 2cedaee6f7 [OpenMP] Initial parsing/sema for the 'omp parallel loop' construct
Adds basic parsing/sema/serialization support for the
  #pragma omp parallel loop directive.

 Differential Revision: https://reviews.llvm.org/D122247
2022-03-22 13:55:47 -07:00
Arthur Eubanks f7d7d2a08d Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads.""
This reverts commit 79613185d3.

Causes crashes, see comments in https://reviews.llvm.org/D121973.
2022-03-22 13:33:49 -07:00
Aaron Ballman a6beb18b84 Revert "Add UTF32 to/from UTF8 conversion functions"
This reverts commit c346068928.

It broke at least one of the builders:
https://lab.llvm.org/buildbot#builders/100/builds/13947
2022-03-22 15:00:40 -04:00
Marcus Johnson c346068928 Add UTF32 to/from UTF8 conversion functions
This is anticipated to be used in new format specifier checking code.
2022-03-22 13:41:43 -04:00
Zakk Chen 23d60ce164 [RISCV][NFC] Refine and refactor RISCVVEmitter and riscv_vector.td.
1. Rename nomask as unmasked to keep with the terminology in the spec.
2. Merge UnMaskpolicy and Maskedpolicy arguments into one in RVVBuiltin class.
3. Rename HasAutoDef as HasBuiltinAlias.
4. Move header definition code into one class.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D120870
2022-03-22 09:58:43 -07:00
Craig Topper 49c2206b3b [VP] Preserve address space of pointer for strided load/store intrinsics.
This adds LLVMAnyPointerToElt to use instead of LLVMPointerToElt.
This allows us to preserve the address space as part of the type
overload for the intrinsic, but still require the vector element
type to match the pointer type.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D122042
2022-03-22 09:52:54 -07:00
Nathan Sidwell c354167ae2 [demangler] Add support for C++20 modules
Add support for module name demangling.  We have two new demangler
nodes -- ModuleName and ModuleEntity. The former represents a module
name in a hierarchical fashion. The latter is the combination of a
(name) node and a module name. Because module names and entity
identities use the same substitution encoding, we have to adjust the
flow of how substitutions are handled, and examine the substituted
node to know how to deal with it.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D119933
2022-03-22 09:42:52 -07:00
Zakk Chen 10fd2822b7 [RISCV] Add policy operand for masked compare and vmsbf/vmsif/vmsof IR
intrinsics.

Those operations are updated under a tail agnostic policy, but they
could have mask agnostic or undisturbed.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D120228
2022-03-22 07:47:21 -07:00
Joseph Huber 5856f30b5a [LTO] Add configuartion option to use default optimization pipeline
This patch adds a configuration option to simply use the default pass
pipeline in favor of the LTO-specific one. We observed some severe
performance penalties when uding device-side LTO for OpenMP offloading
applications caused by the LTO-pass pipeline. This is primarily because
OpenMP uses an LLVM bitcode library to implement a GPU runtime library.
In a standard compilation we link this bitcode library into each source
file and optimize it with the default pipeline. When performing LTO we
link it late with all the files, but the bitcode library never has the
regular optimization pipeline applied to it so we miss a few
optimizations just using the LTO pipeline to optimize it.

I'm not committed to this solution, but it's the easiest method to solve
this performance regression when using LTO without changing the
optimizatin pipeline for other users.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D122133
2022-03-22 09:28:45 -04:00
Djordje Todorovic 73777b4c35 [Debugify] Optimize debugify original mode
Before we start addressing the issue with having
a lot of false positives when using debugify in
the original mode, we have made a few patches that
should speed up the execution of the testing
utility Passes.

For example, when testing a large project
(let's say LLVM project itself), we can face
a lot of potential DI issues. Usually, we use
-verify-each-debuginfo-preserve (that is very
similar to -debugify-each) -- it collects
DI metadata before each Pass, and after the Pass
it checks if the Pass preserved the DI metadata.
However, we can speed up this process, since we
don't need to collect DI metadata before each
Pass -- we could use the DI metadata that are
collected after the previous Pass from
the pipeline as an input for the next Pass.

This patch speeds up the utility for ~2x.

Differential Revision: https://reviews.llvm.org/D115622
2022-03-22 12:14:00 +01:00
Simon Moll 7de383c892 [VP] Fix VPintrinsic::getStaticVectorLength for vp.merge|select
VPIntrinsic::getStaticVectorLength infers the operational vector length
of a VPIntrinsic instance from a type that is used with the intrinsic.
The function used the mask operand before. Yet, vp.merge|select do not
have a mask operand (in the predicating sense that the other VP
intrinsics are using them - it is a selection mask for them). Fallback
to the return type to fix this.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D121913
2022-03-22 11:41:23 +01:00
Zakk Chen 9ab18cc535 [RISCV] Add policy operand for masked vid and viota IR intrinsics.
Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D120227
2022-03-22 02:32:31 -07:00