Commit Graph

5142 Commits

Author SHA1 Message Date
Cullen Rhodes 0395e01583 [IR] Split vscale_range interface
Interface is split from:

  std::pair<unsigned, unsigned> getVScaleRangeArgs()

into separate functions for min/max:

  unsigned getVScaleRangeMin();
  Optional<unsigned> getVScaleRangeMax();

Reviewed By: sdesmalen, paulwalker-arm

Differential Revision: https://reviews.llvm.org/D114075
2021-12-07 10:38:26 +00:00
Cullen Rhodes 698584f89b [IR] Remove unbounded as possible value for vscale_range minimum
The default for min is changed to 1. The behaviour of -mvscale-{min,max}
in Clang is also changed such that 16 is the max vscale when targeting
SVE and no max is specified.

Reviewed By: sdesmalen, paulwalker-arm

Differential Revision: https://reviews.llvm.org/D113294
2021-12-07 09:52:21 +00:00
Sameer Sahasrabuddhe 0fe61ecc2c CycleInfo: Introduce cycles as a generalization of loops
LLVM loops cannot represent irreducible structures in the CFG. This
change introduce the concept of cycles as a generalization of loops,
along with a CycleInfo analysis that discovers a nested
hierarchy of such cycles. This is based on Havlak (1997), Nesting of
Reducible and Irreducible Loops.

The cycle analysis is implemented as a generic template and then
instatiated for LLVM IR and Machine IR. The template relies on a new
GenericSSAContext template which must be specialized when used for
each IR.

This review is a restart of an older review request:
https://reviews.llvm.org/D83094

Original implementation by Nicolai Hähnle <nicolai.haehnle@amd.com>,
with recent refactoring by Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com>

Differential Revision: https://reviews.llvm.org/D112696
2021-12-07 12:02:34 +05:30
Paulo Matos a96d828510 [WebAssembly] Implementation of intrinsic for ref.null and HeapType removal
This patch implements the intrinsic for ref.null.
In the process of implementing int_wasm_ref_null_func() and
int_wasm_ref_null_extern() intrinsics, it removes the redundant
HeapType.

This also causes the textual assembler syntax for ref.null to
change. Instead of receiving an argument: `func` or `extern`, the
instruction mnemonic is either ref.null_func or ref.null_extern,
without the need for a further operand.

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D114979
2021-12-06 09:46:15 +01:00
Nikita Popov 573a9bc4ad [llvm-c] Avoid deprecated APIs in tests
Avoid the use of deprecated (opaque pointer incompatible) APIs
in C API tests, in preparation for header deprecation. Add a
LLVMGetGEPSourceElementType() to cover a bit of functionality
that is necessary for the echo test.

This change is split out from https://reviews.llvm.org/D114936.
2021-12-04 18:58:08 +01:00
Jay Foad c8e84c7a5f [IR,TableGen] Add support for vec3 intrinsic arguments
Add generic support for vec3 types, and in particular define
llvm_v3f32_ty which will be used by AMDGPU's
llvm.amdgcn.image.bvh.intersect.ray intrinsic.

Differential Revision: https://reviews.llvm.org/D114956
2021-12-04 10:32:11 +00:00
Arthur Eubanks 93a20ecee4 [DebugInfo] Check DIEnumerator bit width when comparing for equality
As mentioned in D106585, this causes non-determinism, which can also be
shown by this test case being flaky without this patch.

We were using the APSInt's bit width for hashing, but not for checking
for equality. APInt::isSameValue() does not check bit width.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D115054
2021-12-03 13:40:22 -08:00
Simon Pilgrim 74cc0fa1db [IR][AutoUpgrade] Merge x86 mask load intrinsic upgrades. NFC.
Helps appease MSVC which is complaining about "fatal error C1061: compiler limit: blocks nested too deeply" - we already do the same thing for avx512.mask.store intrinsics.

This is only a stopgap solution until another else-if case needs adding - we really need to refactor this chain of ifs properly.
2021-12-03 16:53:59 +00:00
David Green 08035000cd [ARM] Separate ARM autoupgrade code into a separate function
Try to appease the microsoft compiler which is apparently running out of
if statements. Separate the new ARM code into a separate function to
keep it simpler.
2021-12-03 16:45:26 +00:00
David Green 11f67f5a2c [ARM] Replace if's with a switch, NFC
I'm not having a lot of luck with the microosft compiler recently. Maybe
this will help it with its errors:
llvm\lib\IR\AutoUpgrade.cpp(3726): fatal error C1061: compiler limit: blocks nested too deeply

If not, it's a good code cleanup anyway.
2021-12-03 16:16:30 +00:00
David Green ab0c5cea0b [ARM] Use v2i1 for MVE and CDE intrinsics
This adjusts all the MVE and CDE intrinsics now that v2i1 is a legal
type, to use a <2 x i1> as opposed to emulating the predicate with a
<4 x i1>. The v4i1 workarounds have been removed leaving the natural
v2i1 types, notably in vctp64 which now generates a v2i1 type.

AutoUpgrade code has been added to upgrade old IR, which needs to
convert the old v4i1 to a v2i1 be converting it back and forth to an
integer with arm.mve.v2i and arm.mve.i2v intrinsics. These should be
optimized away in the final assembly.

Differential Revision: https://reviews.llvm.org/D114455
2021-12-03 15:27:58 +00:00
Kazu Hirata 262dd1e42d [llvm] Use range-based for loops (NFC) 2021-12-02 09:27:47 -08:00
Nikita Popov 55d392cc30 [llvm-c] Make LLVMAddAlias opaque pointer compatible
Deprecate LLVMAddAlias in favor of LLVMAddAlias2, which accepts a
value type and an address space. Previously these were extracted
from the pointer type.

Differential Revision: https://reviews.llvm.org/D114860
2021-12-02 09:21:16 +01:00
Nikita Popov 9687c13174 [Verifier] Make matrix intrinsic verification compatible with opaque pointers
Don't check the pointer element type for opaque pointers.
2021-12-01 16:26:05 +01:00
Ellis Hoag 0150645bf5 [DebugInfo] Do not replace existing nodes from DICompileUnit
When creating a new DIBuilder with an existing DICompileUnit, load the
DINodes from the current DICompileUnit so they don't get overwritten.
This is done in the MachineOutliner pass, but it didn't change the CU so
the bug never appeared. We need this if we ever want to add DINodes to
the CU after it has been created, e.g., DIGlobalVariables.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D114556
2021-11-29 19:46:10 -08:00
Zarko Todorovski 95875d246a [LLVM][NFC]Inclusive language: remove occurances of sanity check/test from llvm
Part of work to use more inclusive language in clang/llvm. Rewording
some comments and change function and variable names.
2021-11-24 17:29:55 -05:00
Rosie Sumpter 991074012a [LoopVectorize] Propagate fast-math flags for VPInstruction
In-loop vector reductions which use the llvm.fmuladd intrinsic involve
the creation of two recipes; a VPReductionRecipe for the fadd and a
VPInstruction for the fmul. If the call to llvm.fmuladd has fast-math flags
these should be propagated through to the fmul instruction, so an
interface setFastMathFlags has been added to the VPInstruction class to
enable this.

Differential Revision: https://reviews.llvm.org/D113125
2021-11-24 08:50:04 +00:00
Simon Moll 1e65b93f3a [VP] Canonicalize macros of VPIntrinsics.def
Usage and naming of macros in VPIntrinsics.def has been inconsistent. Rename all property macros to VP_PROPERTY_<name>.  Use BEGIN/END scope macros to attach properties to vp intrinsics and SDNodes (instead of specifying either directly with the property macro).
A follow-up patch has documentation on how the macros are (intended) to be used.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D114144
2021-11-23 16:51:11 +01:00
Kazu Hirata 7ca14f6044 [llvm] Use range-based for loops (NFC) 2021-11-18 09:09:52 -08:00
David Sherwood ca18fcc2c0 [IR] Change CreateStepVector to work with element types smaller than i8
Currently the stepvector intrinsic only supports element types that
are integers of size 8 bits or more. This patch adds support for the
creation of stepvectors with smaller element types by creating
the intrinsic with i8 elements that we then truncate to the requested
size.

It's not currently possible to write a vectoriser test to exercise
this code path so I have added a unit test here:

  llvm/unittests/IR/IRBuilderTest.cpp

Differential Revision: https://reviews.llvm.org/D113767
2021-11-17 10:47:50 +00:00
Philip Reames ed6b69a38f Add a hasPoisonGeneratingFlags proxy wrapper to Instruction [NFC]
This just cuts down on casts to Operator.
2021-11-16 08:48:16 -08:00
Kazu Hirata d243cbf8ea [llvm] Use isa instead of dyn_cast (NFC) 2021-11-14 19:40:46 -08:00
Mircea Trofin a32c2c3808 [NFC] Use Optional<ProfileCount> to model invalid counts
ProfileCount could model invalid values, but a user had no indication
that the getCount method could return bogus data. Optional<ProfileCount>
addresses that, because the user must dereference the optional. In
addition, the patch removes concept duplication.

Differential Revision: https://reviews.llvm.org/D113839
2021-11-14 19:03:30 -08:00
Luís Ferreira 665b4138d9 [DebugInfo] run clang-format on some unformatted files
This trivial patch runs clang-format on some unformatted files before
doing logic changes and prevent hard to review diffs.

Differential Revision: https://reviews.llvm.org/D113572
2021-11-11 18:59:41 -08:00
David Sherwood 2a48b6993a [IR] In ConstantFoldShuffleVectorInstruction use zeroinitializer for splats of 0
When creating a splat of 0 for scalable vectors we tend to create them
with using a combination of shufflevector and insertelement, i.e.

shufflevector (<vscale x 4 x i32> insertelement (<vscale x 4 x i32> poison, i32 0, i32 0),
               <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer)

However, for the case of a zero splat we can actually just replace the
above with zeroinitializer instead. This makes the IR a lot simpler and
easier to read. I have changed ConstantFoldShuffleVectorInstruction to
use zeroinitializer when creating a splat of integer 0 or FP +0.0 values.

Differential Revision: https://reviews.llvm.org/D113394
2021-11-10 09:42:58 +00:00
Joseph Huber b8a825b483 [Attributor] Introduce AAAssumptionInfo to propagate assumptions
This patch introduces a new abstract attributor instance that propagates
assumption information from functions. Conceptually, if a function is
only called by functions that have certain assumptions, then we can
apply the same assumptions to that function. This problem is similar to
calculating the dominator set, but the assumptions are merged instead of
nodes.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D111054
2021-11-09 17:39:18 -05:00
Arthur Eubanks 05963a3d66 Revert "[DebugInfo] Enforce implicit constraints on `distinct` MDNodes"
This reverts commit ee76525698.

Causes crashes, see comments in D104827.
2021-11-09 14:27:55 -08:00
Scott Linder ee76525698 [DebugInfo] Enforce implicit constraints on `distinct` MDNodes
Add UNIQUED and DISTINCT properties in Metadata.def and use them to
implement restrictions on the `distinct` property of MDNodes:

* DIExpression can currently be parsed from IR or read from bitcode
  as `distinct`, but this property is silently dropped when printing
  to IR. This causes accepted IR to fail to round-trip. As DIExpression
  appears inline at each use in the canonical form of IR, it cannot
  actually be `distinct` anyway, as there is no syntax to describe it.
* Similarly, DIArgList is conceptually always uniqued. It is currently
  restricted to only appearing in contexts where there is no syntax for
  `distinct`, but for consistency it is treated equivalently to
  DIExpression in this patch.
* DICompileUnit is already restricted to always being `distinct`, but
  along with adding general support for the inverse restriction I went
  ahead and described this in Metadata.def and updated the parser to be
  general. Future nodes which have this restriction can share this
  support.

The new UNIQUED property applies to DIExpression and DIArgList, and
forbids them to be `distinct`. It also implies they are canonically
printed inline at each use, rather than via MDNode ID.

The new DISTINCT property applies to DICompileUnit, and requires it to
be `distinct`.

A potential alternative change is to forbid the non-inline syntax for
DIExpression entirely, as is done with DIArgList implicitly by requiring
it appear in the context of a function. For example, we would forbid:

    !named = !{!0}
    !0 = !DIExpression()

Instead we would only accept the equivalent inlined version:

    !named = !{!DIExpression()}

This essentially removes the ability to create a `distinct` DIExpression
by construction, as there is no syntax for `distinct` inline. If this
patch is accepted as-is, the result would be that the non-canonical
version is accepted, but the following would be an error and produce a diagnostic:

    !named = !{!0}
    ; error: 'distinct' not allowed for !DIExpression()
    !0 = distinct !DIExpression()

Also update some documentation to consistently use the inline syntax for
DIExpression, and to describe the restrictions on `distinct` for nodes
where applicable.

Reviewed By: StephenTozer, t-tye

Differential Revision: https://reviews.llvm.org/D104827
2021-11-09 18:19:11 +00:00
Nikita Popov 2060895c9c [ConstantRange] Add exact union/intersect (NFC)
For some optimizations on comparisons it's necessary that the
union/intersect is exact and not a superset. Add methods that
return Optional<ConstantRange> only if the result is exact.

For the sake of simplicity this is implemented by comparing
the subset and superset approximations for now, but it should be
possible to do this more directly, as unionWith() and intersectWith()
already distinguish the cases where the result is imprecise for the
preferred range type functionality.
2021-11-07 21:46:06 +01:00
Nikita Popov cf71a5ea8f [ConstantRange] Support zero size in isSizeLargerThan()
From an API perspective, it does not make a lot of sense that 0
is not a valid argument to this function. Add the exact check needed
to support it.
2021-11-07 21:22:45 +01:00
Nikita Popov a8c318b50e [BasicAA] Use index size instead of pointer size
When accumulating the GEP offset in BasicAA, we should use the
pointer index size rather than the pointer size.

Differential Revision: https://reviews.llvm.org/D112370
2021-11-07 18:56:11 +01:00
Nikita Popov 9f0194be45 [ConstantRange] Add getEquivalentICmp() variant with offset (NFCI)
Add a variant of getEquivalentICmp() that produces an optional
offset. This allows us to create an equivalent icmp for all ranges.

Use this in the with.overflow folding code, which was doing this
adjustment separately -- this clarifies that the fold will indeed
always apply.
2021-11-06 21:59:45 +01:00
Kazu Hirata 87e53a0ad8 [llvm] Use make_early_inc_range (NFC) 2021-11-05 19:39:07 -07:00
Scott Linder f82bdf0fcc [NFC][Verifier] Remove redundant Module parameters
These `M` parameters shadow the `M` member in `VerifierSupport`, and
both always refer to the same module. Eliminate the redundant parameters
and always use the member.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D106474
2021-11-05 21:30:02 +00:00
Roman Lebedev a5cd27880a
[IR] Improve member `ShuffleVectorInst::isReplicationMask()`
When we have an actual shuffle, we can impose the additional restriction
that the mask replicates the elements of the first operand, so we know
the replication factor as a ratio of output and op0 vector sizes.
2021-11-06 00:09:27 +03:00
Yonghong Song 3466e00716 Reland "[Attr] support btf_type_tag attribute"
This is to revert commit f95bd18b5f (Revert "[Attr] support
btf_type_tag attribute") plus a bug fix.

Previous change failed to handle cases like below:
    $ cat reduced.c
    void a(*);
    void a() {}
    $ clang -c reduced.c -O2 -g

In such cases, during clang IR generation, for function a(),
CGCodeGen has numParams = 1 for FunctionType. But for
FunctionTypeLoc we have FuncTypeLoc.NumParams = 0. By using
FunctionType.numParams as the bound to access FuncTypeLoc
params, a random crash is triggered. The bug fix is to
check against FuncTypeLoc.NumParams before accessing
FuncTypeLoc.getParam(Idx).

Differential Revision: https://reviews.llvm.org/D111199
2021-11-05 11:25:17 -07:00
Roman Lebedev 01d8759ac9
[IR][ShuffleVector] Introduce `isReplicationMask()` matcher
Avid readers of this saga may recall from previous installments,
that replication mask replicates (lol) each of the `VF` elements
in a vector `ReplicationFactor` times. For example, the mask for
`ReplicationFactor=3` and `VF=4` is: `<0,0,0,1,1,1,2,2,2,3,3,3>`.
More importantly, replication mask is used by LoopVectorizer
when using masked interleaved memory operations.

As discussed in previous installments, while it is used by LV,
and we **seem** to support masked interleaved memory operations on X86,
it's support in cost model leaves a lot to be desired:
until basically yesterday even for AVX512 we had no cost model for it.

As it has been witnessed in the recent
AVX2 `X86TTIImpl::getInterleavedMemoryOpCost()`
costmodel patches, while it is hard-enough to query the cost
of a particular assembly sequence [from llvm-mca],
afterwards the check lines LV costmodel tests must be updated manually.
This is, at the very least, boring.

Okay, now we have decent costmodel coverage for interleaving shuffles,
but now basically the same mind-killing sequence has to be performed
for replication mask. I think we can improve at least the second half
of the problem, by teaching
the `TargetTransformInfoImplCRTPBase::getUserCost()` to recognize
`Instruction::ShuffleVector` that are repetition masks,
adding exhaustive test coverage
using `-cost-model -analyze` + `utils/update_analyze_test_checks.py`

This way we can have good exhaustive coverage for cost model,
and only basic coverage for the LV costmodel.

This patch adds precise undef-aware `isReplicationMask()`,
with exhaustive test coverage.
* `InstructionsTest.ShuffleMaskIsReplicationMask` shows that
   it correctly detects all the known masks.
* `InstructionsTest.ShuffleMaskIsReplicationMask_undef`
  shows that replacing some mask elements in a known replication mask
  still allows us to recognize it as a replication mask.
  Note, with enough undef elts, we may detect a different tuple.
* `InstructionsTest.ShuffleMaskIsReplicationMask_Exhaustive_Correctness`
  shows that if we detected the replication mask with given params,
  then if we actually generate a true replication mask with said params,
  it matches element-wise ignoring undef mask elements.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D113214
2021-11-05 16:53:47 +03:00
Martin Storsjö f95bd18b5f Revert "[Attr] support btf_type_tag attribute"
This reverts commits 737e4216c5 and
ce7ac9e66a.

After those commits, the compiler can crash with a reduced
testcase like this:

$ cat reduced.c
void a(*);
void a() {}
$ clang -c reduced.c -O2 -g
2021-11-05 10:36:40 +02:00
Vitaly Buka 1caabbef8e [OpaquePtr] Fix initialization-order-fiasco
Asan detects it after D112732.
2021-11-04 19:29:06 -07:00
Arthur Eubanks 7175886a0f [NewPM] Make eager analysis invalidation per-adaptor
Follow-up change to D111575.
We don't need eager invalidation on every adaptor. Most notably,
adaptors running passes that use very few analyses, or passes that
purely invalidate specific analyses.

Also allow testing of this via a pipeline string
"function<eager-inv>()".

The compile time/memory impact of this is very comparable to D111575.
https://llvm-compile-time-tracker.com/compare.php?from=9a2eec512a29df45c90c2fcb741e9d5c693b1383&to=b9f20bcdea138060967d95a98eab87ce725b22bb&stat=instructions

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D113196
2021-11-04 17:16:11 -07:00
Yonghong Song 737e4216c5 [Attr] support btf_type_tag attribute
This patch added clang codegen and llvm support
for btf_type_tag support. Currently, btf_type_tag
attribute info is preserved in DebugInfo IR only for
pointer types associated with typedef, global variable
and function declaration. Eventually, such information
is emitted to dwarf.

The following is an example:
  $ cat test.c
  #define __tag __attribute__((btf_type_tag("tag")))
  int __tag *g;
  $ clang -O2 -g -c test.c
  $ llvm-dwarfdump --debug-info test.o
  ...
  0x0000001e:   DW_TAG_variable
                  DW_AT_name      ("g")
                  DW_AT_type      (0x00000033 "int *")
                  DW_AT_external  (true)
                  DW_AT_decl_file ("/home/yhs/test.c")
                  DW_AT_decl_line (2)
                  DW_AT_location  (DW_OP_addr 0x0)

  0x00000033:   DW_TAG_pointer_type
                  DW_AT_type      (0x00000042 "int")

  0x00000038:     DW_TAG_LLVM_annotation
                    DW_AT_name    ("btf_type_tag")
                    DW_AT_const_value     ("tag")

  0x00000041:     NULL

  0x00000042:   DW_TAG_base_type
                  DW_AT_name      ("int")
                  DW_AT_encoding  (DW_ATE_signed)
                  DW_AT_byte_size (0x04)

  0x00000049:   NULL

Basically, a DW_TAG_LLVM_annotation tag will be inserted
under DW_TAG_pointer_type tag if that pointer has a btf_type_tag
associated with it.

Differential Revision: https://reviews.llvm.org/D111199
2021-11-04 14:23:31 -07:00
Erich Keane 09233412ed Revert part of D112349 to allow ifunc resolvers be declarations.
The patch in D112349 added a previously nonexistant restriction on ifunc
resolvers that they MUST be defintions.  However, the function
multiversioning depends on being able to resolve these resolvers at
link-time, so this additional restriction was breaking.
2021-11-03 07:15:16 -07:00
Qiu Chaofan 741aeda97d [PowerPC] Implement longdouble pack/unpack builtins
Implement two builtins to pack/unpack IBM extended long double float,
according to GCC 'Basic PowerPC Builtin Functions Available ISA 2.05'.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D112055
2021-11-03 17:57:25 +08:00
hsmahesha e9ea992496 [IR] Replace *all* uses of a constant expression by corresponding instruction
When a constant expression CE is being converted into a corresponding instruction I,
CE is supposed to be replaced by I. However, it is possible that CE is being used multiple
times within a parent instruction PI. Make sure that *all* the uses of CE within PI are
replaced by I.

Reviewed By: rampitec, arsenm

Differential Revision: https://reviews.llvm.org/D112717
2021-11-02 10:01:46 +05:30
Itay Bookstein 848812a55e [Verifier] Add verification logic for GlobalIFuncs
Verify that the resolver exists, that it is a defined
Function, and that its return type matches the ifunc's
type. Add corresponding check to BitcodeReader, change
clang to emit the correct type, and fix tests to comply.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D112349
2021-10-31 20:00:57 -07:00
Roman Lebedev 03a4f1f3b8
[ConstantRange] Sign-flipping of signedness-invariant comparisons
For certain combination of LHS and RHS constant ranges,
the signedness of the relational comparison predicate is irrelevant.

This implements complete and precise model for all predicates,
as confirmed by the brute-force tests. I'm not sure if there are
some more cases that we can handle here.

In a follow-up, CVP will make use of this.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D90924
2021-10-31 22:53:17 +03:00
Roman Lebedev 25043c8276
[NFCI] Introduce `ICmpInst::compare()` and use it where appropriate
As noted in https://reviews.llvm.org/D90924#inline-1076197
apparently this is a pretty common pattern,
let's not repeat it yet again, but have it in a common place.

There may be some more places where it could be used,
but these are the most obvious ones.
2021-10-30 17:50:06 +03:00
Jay Foad 56f03d25b4 [IR] Remove createReplacementInstr. NFC.
It is unused since D112791.

Differential Revision: https://reviews.llvm.org/D112795
2021-10-29 15:03:19 +01:00
Jay Foad 1b758925ad [IR] Merge createReplacementInstr into ConstantExpr::getAsInstruction
createReplacementInstr was a trivial wrapper around
ConstantExpr::getAsInstruction, which also inserted the new instruction
into a basic block. Implement this directly in getAsInstruction by
adding an InsertBefore parameter and change all callers to use it. NFC.

A follow-up patch will remove createReplacementInstr.

Differential Revision: https://reviews.llvm.org/D112791
2021-10-29 15:02:58 +01:00
Roman Lebedev b291597112
Revert rest of `IRBuilderBase`'s short-circuiting folds
Upon further investigation and discussion,
this is actually the opposite direction from what we should be taking,
and this direction wouldn't solve the motivational problem anyway.

Additionally, some more (polly) tests have escaped being updated.
So, let's just take a step back here.

This reverts commit f3190dedee.
This reverts commit 749581d21f.
This reverts commit f3df87d57e.
This reverts commit ab1dbcecd6.
2021-10-28 02:15:14 +03:00
Nikita Popov ea7be26045 [ConstantRange] Optimize smul_sat() (NFC)
Base the implementation on the APInt smul_sat() implementation,
which is much more efficient than performing calculations in
double the bitwidth.
2021-10-27 21:01:09 +02:00
Philip Reames 425cbbc602 [Operator] Add hasPoisonGeneratingFlags [mostly NFC]
This method parallels the dropPoisonGeneratingFlags on Instruction, but is hoisted to operator to handle constant expressions as well.

This is mostly code movement, but I did go ahead and add the inrange constexpr gep case.  This had been discussed previously, but apparently never followed up o.
2021-10-27 11:25:40 -07:00
Roman Lebedev ab1dbcecd6
[IR] `IRBuilderBase::CreateSelect()`: if cond is a constant i1, short-circuit
While we could emit such a tautological `select`,
it will stick around until the next instsimplify invocation,
which may happen after we count the cost of this redundant `select`.
Which is precisely what happens with loop vectorization legality checks,
and that artificially increases the cost of said checks,
which is bad.

There is prior art for this in `IRBuilderBase::CreateAnd()`/`IRBuilderBase::CreateOr()`.

Refs. https://reviews.llvm.org/D109368#3089809
2021-10-27 18:01:05 +03:00
Yuanfang Chen 7c3fa52785 [DebugInfo] Skip ODRUniquing for mismatched tags
Otherwise, ODRUniquing would map some member method/variable MDNodes
to have enum type DIScope, resulting in invalid debug info and bad
DWARF.

- Add a Verifier check that when a 'scope:' operand is an ODR type that is not an enum.
- Makes ODRUniquing apply to only ODR types with the same tag so that the debuginfo/DWARF is well-formed.

Reviewed By: probinson, aprantl

Differential Revision: https://reviews.llvm.org/D111770
2021-10-26 15:28:25 -07:00
Nikita Popov 75384ecdf8 [InstSimplify] Refactor invariant.group load folding
Currently strip.invariant/launder.invariant are handled by
constructing constant expressions with the intrinsics skipped.
This takes an alternative approach of accumulating the offset
using stripAndAccumulateConstantOffsets(), with a flag to look
through invariant.group intrinsics.

Differential Revision: https://reviews.llvm.org/D112382
2021-10-25 10:56:25 +02:00
Kazu Hirata 1c35973c77 [llvm] Call *(Set|Map)::erase directly (NFC)
We can erase an item in a set or map without checking its membership
first.
2021-10-24 09:32:59 -07:00
Kazu Hirata d14d7068b6 [llvm] Use StringRef::contains (NFC) 2021-10-23 08:45:27 -07:00
Michał Górny 66e06cc8cb [llvm] [ADT] Update llvm::Split() per Pavel Labath's suggestions
Optimize the iterator comparison logic to compare Current.data()
pointers.  Use std::tie for assignments from std::pair.  Replace
the custom class with a function returning iterator_range.

Differential Revision: https://reviews.llvm.org/D110535
2021-10-22 12:27:46 +02:00
Florian Hahn d465315679
[LLVM-C]Add LLVMAddMetadataToInst, deprecated LLVMSetInstDebugLocation.
IRBuilder has been updated to support preserving metdata in a more
general manner. This patch adds `LLVMAddMetadataToInst` and
deprecates `LLVMSetInstDebugLocation` in favor of the more
general function.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D93454
2021-10-22 11:21:28 +01:00
Yonghong Song f6811cec84 [DebugInfo] Support typedef with btf_decl_tag attributes
Clang patch ([1]) added support for btf_decl_tag attributes with typedef
types. This patch added llvm support including dwarf generation.
For example, for typedef
   typedef unsigned * __u __attribute__((btf_decl_tag("tag1")));
   __u u;
the following shows llvm-dwarfdump result:
   0x00000033:   DW_TAG_typedef
                   DW_AT_type      (0x00000048 "unsigned int *")
                   DW_AT_name      ("__u")
                   DW_AT_decl_file ("/home/yhs/work/tests/llvm/btf_tag/t.c")
                   DW_AT_decl_line (1)

   0x0000003e:     DW_TAG_LLVM_annotation
                     DW_AT_name    ("btf_decl_tag")
                     DW_AT_const_value     ("tag1")

   0x00000047:     NULL

  [1] https://reviews.llvm.org/D110127

Differential Revision: https://reviews.llvm.org/D110129
2021-10-21 08:42:58 -07:00
Itay Bookstein 08ed216000 [IR] Refactor GlobalIFunc to inherit from GlobalObject, Remove GlobalIndirectSymbol
As discussed in:
* https://reviews.llvm.org/D94166
* https://lists.llvm.org/pipermail/llvm-dev/2020-September/145031.html

The GlobalIndirectSymbol class lost most of its meaning in
https://reviews.llvm.org/D109792, which disambiguated getBaseObject
(now getAliaseeObject) between GlobalIFunc and everything else.
In addition, as long as GlobalIFunc is not a GlobalObject and
getAliaseeObject returns GlobalObjects, a GlobalAlias whose aliasee
is a GlobalIFunc cannot currently be modeled properly. Creating
aliases for GlobalIFuncs does happen in the wild (e.g. glibc). In addition,
calling getAliaseeObject on a GlobalIFunc will currently return nullptr,
which is undesirable because it should return the object itself for
non-aliases.

This patch refactors the GlobalIFunc class to inherit directly from
GlobalObject, and removes GlobalIndirectSymbol (while inlining the
relevant parts into GlobalAlias and GlobalIFunc). This allows for
calling getAliaseeObject() on a GlobalIFunc to return the GlobalIFunc
itself, making getAliaseeObject() more consistent and enabling
alias-to-ifunc to be properly modeled in the IR.

I exercised some judgement in the API clients of GlobalIndirectSymbol:
some were 'monomorphized' for GlobalAlias and GlobalIFunc, and
some remained shared (with the type adapted to become GlobalValue).

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D108872
2021-10-20 10:29:47 -07:00
Arthur Eubanks ac0561ebb7 [Verifier] Add context for assume operand bundles verifier errors
And fix a typo.
2021-10-19 09:52:04 -07:00
Arthur Eubanks b8ce97372d [NewPM] Add PipelineTuningOption to eagerly invalidate analyses
This trades off more compile time for less peak memory usage. Right now
it invalidates all function analyses after a module->function or
cgscc->function adaptor.

https://llvm-compile-time-tracker.com/compare.php?from=1fb24fe85a19ae71b00875ff6c96ef1831dcf7e3&to=cb28ddb063c87f0d5df89812ab2de9a69dd276db&stat=instructions
https://llvm-compile-time-tracker.com/compare.php?from=1fb24fe85a19ae71b00875ff6c96ef1831dcf7e3&to=cb28ddb063c87f0d5df89812ab2de9a69dd276db&stat=max-rss

For now this is just experimental.

See comments on why this may affect optimizations.

Reviewed By: asbirlea, nikic

Differential Revision: https://reviews.llvm.org/D111575
2021-10-18 13:20:35 -07:00
Mircea Trofin 8612b47a8e [NFC] ProfileSummary: const a bunch of members and fields.
It helps readability and maintainability (don't need to chase down
writes to a field I see is const, for example)
2021-10-18 08:55:06 -07:00
Stephen Tozer b9ca73e1a8 [DebugInfo] Correctly handle arrays with 0-width elements in GEP salvaging
Fixes an issue where GEP salvaging did not properly account for GEP
instructions which stepped over array elements of width 0 (effectively a
no-op). This unnecessarily produced long expressions by appending
`... + (x * 0)` and potentially extended the number of SSA values used
in the dbg.value. This also erroneously triggered an assert in the
salvage function that the element width would be strictly positive.
These issues are resolved by simply ignoring these useless operands.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D111809
2021-10-18 12:01:12 +01:00
Nikita Popov 274b2439f8 [ConstantRange] Add fast signed multiply
The multiply() implementation is very slow -- it performs six
multiplications in double the bitwidth, which means that it will
typically work on allocated APInts and bypass fast-path
implementations. Add an additional implementation that doesn't
try to produce anything better than a full range if overflow is
possible. At least for the BasicAA use-case, we really don't care
about more precise modeling of overflow behavior. The current
use of multiply() is fine while the implementation is limited to
a single index, but extending it to the multiple-index case makes
the compile-time impact untenable.
2021-10-17 16:41:49 +02:00
Nikita Popov 587493b441 [ConstantRange] Compute precise shl range for single elements
For the common case where the shift amount is constant (a single
element range) we can easily compute a precise range (up to
unsigned envelope), so do that.
2021-10-15 23:44:41 +02:00
Craig Topper 24703cb6a4 [IR] Fix a few incorrect paths in file header comments. NFC 2021-10-15 09:18:57 -07:00
Kazu Hirata 81e9c90686 [llvm] Use llvm::is_contained (NFC) 2021-10-14 22:44:09 -07:00
Hongtao Yu 098a0d8fbc [CSSPGO] Unblock optimizations with pseudo probe instrumentation part 3.
This patch continues unblocking optimizations that are blocked by pseudo probe instrumentation.

Not exactly like DbgIntrinsics, PseudoProbe intrinsic has other attributes (such as mayread, maywrite, mayhaveSideEffect) that can block optimizations. The issues fixed are:
- Flipped default param of getFirstNonPHIOrDbg API to skip pseudo probes
- Unblocked CSE by avoiding pseudo probe from clobbering memory SSA
- Unblocked induction variable simpliciation
- Allow empty loop deletion by treating probe intrinsic isDroppable
- Some refactoring.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D110847
2021-10-12 09:44:12 -07:00
Nikita Popov a94002cd64 [Type] Avoid APFloat.h include (NFC)
This is only used by a handful of methods working on fltSemantics,
and having these defined inline in the header does not look
particularly important.
2021-10-09 11:29:26 +02:00
Philip Reames de5477ed42 Add a statistic to track number of times we rebuild instruction ordering
The goal here is to assist some future tuning work both on instruction ordering invalidation, and on some client code which uses it.
2021-10-08 10:59:34 -07:00
Kevin P. Neal 97c231666a [NFC] Rename functions to match our naming scheme.
In the review of D111085 it was pointed out that these functions don't
conform to the naming scheme in use in LLVM. With this commit we should
be good for all of FPEnv.h.
2021-10-07 14:12:41 -04:00
Erik Desjardins 11c8efd4db [Inline] Introduce Constant::hasOneLiveUse, use it instead of hasOneUse in inline cost model (PR51667)
Otherwise, inlining costs may be pessimized by dead constants.

Fixes https://bugs.llvm.org/show_bug.cgi?id=51667.

Reviewed By: mtrofin, aeubanks

Differential Revision: https://reviews.llvm.org/D109294
2021-10-07 08:33:25 -07:00
Itay Bookstein 40ec1c0f16 [IR][NFC] Rename getBaseObject to getAliaseeObject
To better reflect the meaning of the now-disambiguated {GlobalValue,
GlobalAlias}::getBaseObject after breaking off GlobalIFunc::getResolverFunction
(D109792), the function is renamed to getAliaseeObject.
2021-10-06 19:33:10 -07:00
Arthur Eubanks 05392466f0 Reland [IR] Increase max alignment to 4GB
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.

This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.

The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.

Updating clang's max allowed alignment will come in a future patch.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D110451
2021-10-06 13:29:23 -07:00
Arthur Eubanks 569346f274 Revert "Reland [IR] Increase max alignment to 4GB"
This reverts commit 8d64314ffe.
2021-10-06 11:38:11 -07:00
Arthur Eubanks 8d64314ffe Reland [IR] Increase max alignment to 4GB
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.

This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.

The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.

Updating clang's max allowed alignment will come in a future patch.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D110451
2021-10-06 11:03:51 -07:00
Arthur Eubanks 72cf8b6044 Revert "[IR] Increase max alignment to 4GB"
This reverts commit df84c1fe78.

Breaks some bots
2021-10-06 10:21:35 -07:00
Arthur Eubanks df84c1fe78 [IR] Increase max alignment to 4GB
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.

This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.

The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.

Updating clang's max allowed alignment will come in a future patch.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D110451
2021-10-06 09:54:14 -07:00
Simon Pilgrim 21661607ca [llvm] Replace report_fatal_error(std::string) uses with report_fatal_error(Twine)
As described on D111049, we're trying to remove the <string> dependency from error handling and replace uses of report_fatal_error(const std::string&) with the Twine() variant which can be forward declared.
2021-10-06 12:04:30 +01:00
Kazu Hirata e6e29831dd [IR] Migrate from getNumArgOperands to arg_size (NFC)
Note that arg_operands is considered a legacy name.  See
llvm/include/llvm/IR/InstrTypes.h for details.
2021-10-04 08:40:25 -07:00
Jay Foad 566690b067 [APFloat] Remove BitWidth argument from getAllOnesValue
There's no need to pass this in explicitly because it is
trivially available from the semantics.
2021-10-04 11:32:16 +01:00
Jay Foad a9bceb2b05 [APInt] Stop using soft-deprecated constructors and methods in llvm. NFC.
Stop using APInt constructors and methods that were soft-deprecated in
D109483. This fixes all the uses I found in llvm, except for the APInt
unit tests which should still test the deprecated methods.

Differential Revision: https://reviews.llvm.org/D110807
2021-10-04 08:57:44 +01:00
Nikita Popov 5ddf49b906 [AttrBuilder] Make handling of int attribtues more generifc (NFC)
This is basically the same change as 42cc7f3c52
but for integer attributes. Rather than treating each attribute
individually, handle them all the same way. The only thing that
needs to be done per attribute is specify how get/add convert
from/to the raw representation.
2021-10-03 23:42:28 +02:00
Dávid Bolvanský 5f2f611880 Fixed more warnings in LLVM produced by -Wbitwise-instead-of-logical 2021-10-03 13:58:10 +02:00
Min-Yih Hsu 475de8da01 [IR]PATCH 2/2: Add MDNode::printTree and dumpTree
This patch adds the functionalities to print MDNode in tree shape. For
example, instead of printing a MDNode like this:
```
<0x5643e1166888> = !DILocalVariable(name: "foo", arg: 2, scope: <0x5643e11c9740>, file: <0x5643e11c6ec0>, line: 8, type: <0x5643e11ca8e0>, flags: DIFlagPublic | DIFlagFwdDecl, align: 8)
```
The printTree/dumpTree functions can give you:
```
<0x5643e1166888> = !DILocalVariable(name: "foo", arg: 2, scope: <0x5643e11c9740>, file: <0x5643e11c6ec0>, line: 8, type: <0x5643e11ca8e0>, flags: DIFlagPublic | DIFlagFwdDecl, align: 8)
  <0x5643e11c9740> = distinct !DISubprogram(scope: null, spFlags: 0)
  <0x5643e11c6ec0> = distinct !DIFile(filename: "file.c", directory: "/path/to/dir")
  <0x5643e11ca8e0> = distinct !DIDerivedType(tag: DW_TAG_pointer_type, baseType: <0x5643e11668d8>, size: 1, align: 2)
    <0x5643e11668d8> = !DIBasicType(tag: DW_TAG_unspecified_type, name: "basictype")
```
Which is useful when using it in debugger. Where sometimes printing the
whole module to see all MDNodes is too expensive.

Differential Revision: https://reviews.llvm.org/D110113
2021-10-02 21:19:52 -07:00
Min-Yih Hsu b2d078fb0c [IR]PATCH 1/2: Add AsmWriterContext into AsmWriter
AsmWriterContext is a simple compound that stores TypePrinting,
SlotTracker (i.e. "Machine" in AsmWriter), and Module instances -- three
of the most commonly used objects in the AsmWriter infrastructure.
Previously these three objects are passed as separate function arguments
to most of the printer functions in this file. Tidying them up can bring
easier code refactoring on printer functions in the future (e.g. when we
want to pass additional objects to all printer functions).

NOTE: Theoritically, this patch should be NFC.

Differential Revision: https://reviews.llvm.org/D110112
2021-10-02 21:19:51 -07:00
Alfsonso Gregory 060a96a7b5 [LLVM][IR] Fixed input arguments for Verifier getter
ParameterABIAttributes functions work with unsigned integers as the index, so having the getter be signed makes no sense. Additionally, for this reason, the loop vars that were signed were changed to unsigned too.

Reviewed By: jeroen.dobbelaere

Differential Revision: https://reviews.llvm.org/D110344
2021-10-03 08:09:30 +05:30
Arthur Eubanks a7b4ce9cfd [NFC][AttributeList] Replace index_begin/end with an iterator
We expose the fact that we rely on unsigned wrapping to iterate through
all indexes. This can be confusing. Rather, keeping it as an
implementation detail through an iterator is less confusing and is less
code.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D110885
2021-10-01 10:17:41 -07:00
Koutheir Attouchi 16661b1a3c Expose `DIBuilder::finalizeSubprogram()` through the LLVM C API
The LLVM C API function is called `LLVMDIBuilderFinalizeSubprogram()`.

Reviewed By: CodaFi

Differential Revision: https://reviews.llvm.org/D104794
2021-09-30 20:59:41 -07:00
Kazu Hirata f631173d80 [llvm] Migrate from arg_operands to args (NFC)
Note that arg_operands is considered a legacy name.  See
llvm/include/llvm/IR/InstrTypes.h for details.
2021-09-30 08:51:21 -07:00
Wesley Wiser 2dd883439c [Mangler] Calculate the argument list byte count suffix correctly when returning large values
`__stdcall`, `__fastcall` and `__vectorcall` return large values via a
hidden pointer argument. However, the size of that argument should not
be included in the argument list byte count suffix added to the
function's decorated name.

This patch fixes that issue so that LLVM generates the same decorated
name as MSVC does.

MSVC example: https://godbolt.org/z/nc35MKPhr

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D110719
2021-09-29 11:42:28 -07:00
Arthur Eubanks aa53785f23 Reland [clang] Rework dontcall attributes
To avoid using the AST when emitting diagnostics, split the "dontcall"
attribute into "dontcall-warn" and "dontcall-error", and also add the
frontend attribute value as the LLVM attribute value. This gives us all
the information to report diagnostics we need from within the IR (aside
from access to the original source).

One downside is we directly use LLVM's demangler rather than using the
existing Clang diagnostic pretty printing of symbols.

Previous revisions didn't properly declare the new dependencies.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D110364
2021-09-28 15:31:30 -07:00
Arthur Eubanks 7833d20f1f Revert "[clang] Rework dontcall attributes"
This reverts commit 2943071e2e.

Breaks bots
2021-09-28 14:49:27 -07:00
Arthur Eubanks 2943071e2e [clang] Rework dontcall attributes
To avoid using the AST when emitting diagnostics, split the "dontcall"
attribute into "dontcall-warn" and "dontcall-error", and also add the
frontend attribute value as the LLVM attribute value. This gives us all
the information to report diagnostics we need from within the IR (aside
from access to the original source).

One downside is we directly use LLVM's demangler rather than using the
existing Clang diagnostic pretty printing of symbols.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D110364
2021-09-28 14:21:10 -07:00
hyeongyu kim 86bf234d0b [IR] Change the default value of InstertElement to poison (1/4)
This patch is for fixing potential insertElement-related bugs like D93818.
```
V = UndefValue::get(VecTy);
for(...)
  V = Builder.CreateInsertElementy(V, Elt, Idx);
=>
V = PoisonValue::get(VecTy);
for(...)
  V = Builder.CreateInsertElementy(V, Elt, Idx);
```
Like above, this patch changes the placeholder V to poison.
The patch will be separated into several commits.

Reviewed By: aqjune

Differential Revision: https://reviews.llvm.org/D110311
2021-09-28 22:29:16 +09:00
“bhkumarn” 62eeacce17 [DebugInfo] Emit DW_TAG_namelist and DW_TAG_namelist_item
This patch emits DW_TAG_namelist and DW_TAG_namelist_item for fortran
namelist variables. DICompositeType is extended to support this fortran
feature.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D108553
2021-09-28 14:40:58 +05:30
modimo 20faf78919 [ThinLTO] Add noRecurse and noUnwind thinlink function attribute propagation
Thinlink provides an opportunity to propagate function attributes across modules, enabling additional propagation opportunities.

This change propagates (currently default off, turn on with `disable-thinlto-funcattrs=1`) noRecurse and noUnwind based off of function summaries of the prevailing functions in bottom-up call-graph order. Testing on clang self-build:
1. There's a 35-40% increase in noUnwind functions due to the additional propagation opportunities.
2. Throughput is measured at 10-15% increase in thinlink time which itself is 1.5% of E2E link time.

Implementation-wise this adds the following summary function attributes:
1. noUnwind: function is noUnwind
2. mayThrow: function contains a non-call instruction that `Instruction::mayThrow` returns true on (e.g. windows SEH instructions)
3. hasUnknownCall: function contains calls that don't make it into the summary call-graph thus should not be propagated from (e.g. indirect for now, could add no-opt functions as well)

Testing:
Clang self-build passes and 2nd stage build passes check-all
ninja check-all with newly added tests passing

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D36850
2021-09-27 12:28:07 -07:00
Simon Pilgrim ee267b1c7c [IR] DIBuilder::createEnumerator - pass APSInt by const reference
Avoid unnecessary copy by value.
2021-09-25 11:58:06 +01:00