Commit Graph

1857 Commits

Author SHA1 Message Date
Scott Linder 8cd35ad854 [DebugInfo] Enforce implicit constraints on `distinct` MDNodes
Add UNIQUED and DISTINCT properties in Metadata.def and use them to
implement restrictions on the `distinct` property of MDNodes:

* DIExpression can currently be parsed from IR or read from bitcode
  as `distinct`, but this property is silently dropped when printing
  to IR. This causes accepted IR to fail to round-trip. As DIExpression
  appears inline at each use in the canonical form of IR, it cannot
  actually be `distinct` anyway, as there is no syntax to describe it.
* Similarly, DIArgList is conceptually always uniqued. It is currently
  restricted to only appearing in contexts where there is no syntax for
  `distinct`, but for consistency it is treated equivalently to
  DIExpression in this patch.
* DICompileUnit is already restricted to always being `distinct`, but
  along with adding general support for the inverse restriction I went
  ahead and described this in Metadata.def and updated the parser to be
  general. Future nodes which have this restriction can share this
  support.

The new UNIQUED property applies to DIExpression and DIArgList, and
forbids them to be `distinct`. It also implies they are canonically
printed inline at each use, rather than via MDNode ID.

The new DISTINCT property applies to DICompileUnit, and requires it to
be `distinct`.

A potential alternative change is to forbid the non-inline syntax for
DIExpression entirely, as is done with DIArgList implicitly by requiring
it appear in the context of a function. For example, we would forbid:

    !named = !{!0}
    !0 = !DIExpression()

Instead we would only accept the equivalent inlined version:

    !named = !{!DIExpression()}

This essentially removes the ability to create a `distinct` DIExpression
by construction, as there is no syntax for `distinct` inline. If this
patch is accepted as-is, the result would be that the non-canonical
version is accepted, but the following would be an error and produce a diagnostic:

    !named = !{!0}
    ; error: 'distinct' not allowed for !DIExpression()
    !0 = distinct !DIExpression()

Also update some documentation to consistently use the inline syntax for
DIExpression, and to describe the restrictions on `distinct` for nodes
where applicable.

Reviewed By: StephenTozer, t-tye

Differential Revision: https://reviews.llvm.org/D104827
2021-06-28 21:20:04 +00:00
Nathan Chancellor 4ae0ab095b
[BitCode] Add noprofile to getAttrFromCode()
After D104475 / D104658, building the Linux kernel with ThinLTO is
broken:

ld.lld: error: Unknown attribute kind (73) (Producer: 'LLVM13.0.0git'
Reader: 'LLVM 13.0.0git')

getAttrFromCode() has never handled this attribute so it is written
during the ThinLTO phase but it cannot be handled during the linking
phase.

Add noprofile to getAttrFromCode() so that disassembly works properly.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D104995
2021-06-27 11:59:57 -07:00
Nikita Popov 5b2573e9c7 [OpaquePtr] Enumerate GlobalAlias value type
The type is no longer implicitly enumerated through the pointer
type.
2021-06-25 21:21:10 +02:00
Nikita Popov 18d7e822ab [OpaquePtr] Enumerate alloca type
This is no longer implicitly enumerated through the pointer type.
2021-06-25 10:48:57 +02:00
Nikita Popov 536872a1f7 [OpaquePtr] Enumerate global variable type
This is no longer implicitly enumerated through the pointer type.
2021-06-25 10:46:28 +02:00
Arthur Eubanks 4c8174f54b [OpaquePtr] Introduce option to force all pointers to be opaque pointers
We don't want to start updating tests to use opaque pointers until we're
close to the opaque pointer transition. However, before the transition
we want to run tests as if pointers are opaque pointers to see if there
are any crashes.

At some point when we have a flag to only create opaque pointers in the
bitcode and textual IR readers, and when we have fixed all places that
try to read a pointee type, this flag will be useless. However, until
then, this can help us find issues more easily.

Since the cl::opt is read into LLVMContext, we need to make sure
LLVMContext is created after cl::ParseCommandLineOptions().

Previously ValueEnumerator would visit the value types of global values
via the pointer type, but with opaque pointers we have to manually visit
the value type.

Reviewed By: nikic, dexonsmith

Differential Revision: https://reviews.llvm.org/D103503
2021-06-24 13:32:31 -07:00
Nikita Popov e5f2b035dd [OpaquePtr] Support invoke instruction
With call support in place, this is only a matter of relaxing a
bitcode reader assertion.
2021-06-23 20:24:33 +02:00
Nikita Popov f660af46e3 [OpaquePtr] Support call instruction
Add support for call of opaque pointer, currently only possible for
indirect calls.

This requires a bit of special casing in LLParser, as calls do not
specify the callee operand type explicitly.

Differential Revision: https://reviews.llvm.org/D104740
2021-06-23 20:17:26 +02:00
Florian Hahn 34cccdaed7
[BitcodeReader] Validate Strtab before accessing.
This fixes a crash with invalid bitcode files that have records
referencing names in Strtab, but Strtab is not present or the index is
out-of-bounds.

This fixes the following clusterfuzz issue:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=29895

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D95554
2021-06-22 14:52:16 +01:00
Nikita Popov d9fe96fe26 [OpaquePtr] Support opaque constant expression GEP
Adjust assertions to use isOpaqueOrPointeeTypeMatches() and make
it return an opaque pointer result for an opaque base pointer. We
also need to enumerate the element type, as it is no longer
implicitly enumerated through the pointer type.

Differential Revision: https://reviews.llvm.org/D104655
2021-06-21 20:06:25 +02:00
Nikita Popov 9f779195d3 [OpaquePtr] Return opaque pointer from opaque pointer GEP
For a GEP on an opaque pointer, also return an opaque pointer (or
vector of opaque pointer) result.

This requires explicitly enumerating the GEP source element type,
because it is now no longer implicitly enumerated as part of either
the source or result pointer types.

Differential Revision: https://reviews.llvm.org/D104652
2021-06-21 18:36:32 +02:00
Arthur Eubanks cc8d32ae7d Move some code under NDEBUG from D103135 2021-06-14 11:39:12 -07:00
Arthur Eubanks 75d3b46ad2 Remove accidentally added debugging code from D103135 2021-06-14 11:11:40 -07:00
Arthur Eubanks 8c5a44901c [OpaquePtr] Remove existing support for forward compatibility
It assumes that PointerType will keep having an optional pointee type,
but we'd like to remove the pointee type in PointerType at some point.

I feel like the current implementation could be simplified anyway,
although perhaps I'm underestimating the amount of work needed
throughout BitcodeReader.

We will still need a side table to keep track of pointee types. This
will be reimplemented at some point.

This is essentially a revert of a4771e9d (which doesn't look like it was
reviewed anyway).

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D103135
2021-06-14 10:52:56 -07:00
Arthur Eubanks 1202f559bd [OpaquePtr] Make atomicrmw work with opaque pointers
FullTy is only necessary when we need to figure out what type an
instruction works with given a pointer's pointee type. However, we just
end up using the value operand's type, so FullTy isn't necessary.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D102788
2021-05-25 20:16:21 -07:00
Arthur Eubanks ad90a6be21 [OpaquePtr] Create new bitcode encoding for atomicrmw
Since the opaque pointer type won't contain the pointee type, we need to
separately encode the value type for an atomicrmw.

Emit this new code for atomicrmw.

Handle this new code and the old one in the bitcode reader.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D103123
2021-05-25 16:30:34 -07:00
Arthur Eubanks 0bbb502daa Revert "[OpaquePtr] Make atomicrmw work with opaque pointers"
This reverts commit 0bebda17be.

Causing "Invalid record" errors.
2021-05-25 10:14:58 -07:00
Marco Elver 280333021e [SanitizeCoverage] Add support for NoSanitizeCoverage function attribute
We really ought to support no_sanitize("coverage") in line with other
sanitizers. This came up again in discussions on the Linux-kernel
mailing lists, because we currently do workarounds using objtool to
remove coverage instrumentation. Since that support is only on x86, to
continue support coverage instrumentation on other architectures, we
must support selectively disabling coverage instrumentation via function
attributes.

Unfortunately, for SanitizeCoverage, it has not been implemented as a
sanitizer via fsanitize= and associated options in Sanitizers.def, but
rolls its own option fsanitize-coverage. This meant that we never got
"automatic" no_sanitize attribute support.

Implement no_sanitize attribute support by special-casing the string
"coverage" in the NoSanitizeAttr implementation. To keep the feature as
unintrusive to existing IR generation as possible, define a new negative
function attribute NoSanitizeCoverage to propagate the information
through to the instrumentation pass.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=49035

Reviewed By: vitalybuka, morehouse

Differential Revision: https://reviews.llvm.org/D102772
2021-05-25 12:57:14 +02:00
Arthur Eubanks 7a29a12301 [Verifier] Move some atomicrmw/cmpxchg checks to instruction creation
These checks already exist as asserts when creating the corresponding
instruction. Anybody creating these instructions already need to take
care to not break these checks.

Move the checks for success/failure ordering in cmpxchg from the
verifier to the LLParser and BitcodeReader plus an assert.

Add some tests for cmpxchg ordering. The .bc files are created from the
.ll files with an llvm-as with these checks disabled.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D102803
2021-05-21 13:41:17 -07:00
Steven Wu 5b6cae5524 [IR][AutoUpgrade] Drop alignment from non-pointer parameters and returns
This is a follow-up of D102201. After some discussion, it is a better idea
to upgrade all invalid uses of alignment attributes on function return
values and parameters, not just limited to void function return types.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D102726
2021-05-20 09:54:38 -07:00
Arthur Eubanks 0bebda17be [OpaquePtr] Make atomicrmw work with opaque pointers
FullTy is only necessary when we need to figure out what type an
instruction works with given a pointer's pointee type. However, we just
end up using the value operand's type, so FullTy isn't necessary.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D102788
2021-05-19 12:49:28 -07:00
Arthur Eubanks 28b9771472 [OpaquePtr] Make GEPs work with opaque pointers
No verifier changes needed, the verifier currently doesn't check that
the pointer operand's pointee type matches the GEP type. There is a
similar check in GetElementPtrInst::Create() though.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D102744
2021-05-19 12:39:37 -07:00
Arthur Eubanks 6013d84392 [OpaquePtr] Make loads and stores work with opaque pointers
Don't check that types match when the pointer operand is an opaque
pointer.

I would separate the Assembler and Verifier changes, but
verify-uselistorder in the Assembler test ends up running the verifier.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D102450
2021-05-18 13:43:50 -07:00
Tim Northover ea0eec69f1 IR+AArch64: add a "swiftasync" argument attribute.
This extends any frame record created in the function to include that
parameter, passed in X22.

The new record looks like [X22, FP, LR] in memory, and FP is stored with 0b0001
in bits 63:60 (CodeGen assumes they are 0b0000 in normal operation). The effect
of this is that tools walking the stack should expect to see one of three
values there:

  * 0b0000 => a normal, non-extended record with just [FP, LR]
  * 0b0001 => the extended record [X22, FP, LR]
  * 0b1111 => kernel space, and a non-extended record.

All other values are currently reserved.

If compiling for arm64e this context pointer is address-discriminated with the
discriminator 0xc31a and the DB (process-specific) key.

There is also an "i8** @llvm.swift.async.context.addr()" intrinsic providing
front-ends access to this slot (and forcing its creation initialized to nullptr
if necessary).
2021-05-14 11:43:58 +01:00
Arthur Eubanks 2155dc51d7 [IR] Introduce the opaque pointer type
The opaque pointer type is essentially just a normal pointer type with a
null pointee type.

This also adds support for the opaque pointer type to the bitcode
reader/writer, as well as to textual IR.

To avoid confusion with existing pointer types, we disallow creating a
pointer to an opaque pointer.

Opaque pointer types should not be widely used at this point since many
parts of LLVM still do not support them. The next steps are to add some
very simple use cases of opaque pointers to make sure they work, then
start pretending that all pointers are opaque pointers and see what
breaks.

https://lists.llvm.org/pipermail/llvm-dev/2021-May/150359.html

Reviewed By: dblaikie, dexonsmith, pcc

Differential Revision: https://reviews.llvm.org/D101704
2021-05-13 15:22:27 -07:00
cynecx 8ec9fd4839 Support unwinding from inline assembly
I've taken the following steps to add unwinding support from inline assembly:

1) Add a new `unwind` "attribute" (like `sideeffect`) to the asm syntax:

```
invoke void asm sideeffect unwind "call thrower", "~{dirflag},~{fpsr},~{flags}"()
    to label %exit unwind label %uexit
```

2.) Add Bitcode writing/reading support + LLVM-IR parsing.

3.) Emit EHLabels around inline assembly lowering (SelectionDAGBuilder + GlobalISel) when `InlineAsm::canThrow` is enabled.

4.) Tweak InstCombineCalls/InlineFunction pass to not mark inline assembly "calls" as nounwind.

5.) Add clang support by introducing a new clobber: "unwind", which lower to the `canThrow` being enabled.

6.) Don't allow unwinding callbr.

Reviewed By: Amanieu

Differential Revision: https://reviews.llvm.org/D95745
2021-05-13 19:13:03 +01:00
Steven Wu 4eff946947 [IR][AutoUpgrade] Drop align attribute from void return types
Since D87304, `align` become an invalid attribute on none pointer types and
verifier will reject bitcode that has invalid `align` attribute.

The problem is before the change, DeadArgumentElimination can easily
turn a pointer return type into a void return type without removing
`align` attribute. Teach Autograde to remove invalid `align` attribute
from return types to maintain bitcode compatibility.

rdar://77022993

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D102201
2021-05-11 08:23:55 -07:00
Stephen Tozer e5d844b587 [Bitcode] Ensure DIArgList in bitcode has no null or forward metadata refs
This patch fixes an issue in which ConstantAsMetadata arguments to a
DIArglist, as well as the Constant values referenced by that metadata,
would not be always be emitted correctly into bitcode. This patch fixes
this issue firstly by searching for ConstantAsMetadata in DIArgLists
(previously we would only search for them when directly wrapped in
MetadataAsValue), and secondly by enumerating all of a DIArgList's
arguments directly prior to enumerating the DIArgList itself.

This patch also adds a number of asserts, and no longer treats the
arguments to a DIArgList as optional fields when reading/writing to
bitcode.

Differential Revision: https://reviews.llvm.org/D100572
2021-04-22 12:03:33 +01:00
Matt Arsenault 9a0c9402fa Reapply "OpaquePtr: Turn inalloca into a type attribute"
This reverts commit 07e46367ba.
2021-03-29 08:55:30 -04:00
Oliver Stannard 07e46367ba Revert "Reapply "OpaquePtr: Turn inalloca into a type attribute""
Reverting because test 'Bindings/Go/go.test' is failing on most
buildbots.

This reverts commit fc9df30991.
2021-03-29 11:32:22 +01:00
Matt Arsenault fc9df30991 Reapply "OpaquePtr: Turn inalloca into a type attribute"
This reverts commit 20d5c42e0e.
2021-03-28 13:35:21 -04:00
Nico Weber 20d5c42e0e Revert "OpaquePtr: Turn inalloca into a type attribute"
This reverts commit 4fefed6563.
Broke check-clang everywhere.
2021-03-28 13:02:52 -04:00
Matt Arsenault 4fefed6563 OpaquePtr: Turn inalloca into a type attribute
I think byval/sret and the others are close to being able to rip out
the code to support the missing type case. A lot of this code is
shared with inalloca, so catch this up to the others so that can
happen.
2021-03-28 11:12:23 -04:00
Bradley Smith 48f5a392cb [IR] Add vscale_range IR function attribute
This attribute represents the minimum and maximum values vscale can
take. For now this attribute is not hooked up to anything during
codegen, this will be added in the future when such codegen is
considered stable.

Additionally hook up the -msve-vector-bits=<x> clang option to emit this
attribute.

Differential Revision: https://reviews.llvm.org/D98030
2021-03-22 12:05:06 +00:00
gbtozers e5d958c456 [DebugInfo] Support DIArgList in DbgVariableIntrinsic
This patch updates DbgVariableIntrinsics to support use of a DIArgList for the
location operand, resulting in a significant change to its interface. This patch
does not update all IR passes to support multiple location operands in a
dbg.value; the only change is to update the DbgVariableIntrinsic interface and
its uses. All code outside of the intrinsic classes assumes that an intrinsic
will always have exactly one location operand; they will still support
DIArgLists, but only if they contain exactly one Value.

Among other changes, the setOperand and setArgOperand functions in
DbgVariableIntrinsic have been made private. This is to prevent code from
setting the operands of these intrinsics directly, which could easily result in
incorrect/invalid operands being set. This does not prevent these functions from
being called on a debug intrinsic at all, as they can still be called on any
CallInst pointer; it is assumed that any code directly setting the operands on a
generic call instruction is doing so safely. The intention for making these
functions private is to prevent DIArgLists from being overwritten by code that's
naively trying to replace one of the Values it points to, and also to fail fast
if a DbgVariableIntrinsic is updated to use a DIArgList without a valid
corresponding DIExpression.
2021-03-08 14:36:13 +00:00
Matt Arsenault c79a4490d4 OpaquePtr: Record byref types in bitcode writer
I missed this case when adding byref. I believe this is NFC until
pointee types are really removed.
2021-03-07 13:14:17 -05:00
gbtozers 65600cb2a7 [DebugInfo] Add DIArgList MD to store multple values in DbgVariableIntrinsics
This patch adds a new metadata node, DIArgList, which contains a list of SSA
values. This node is in many ways similar in function to the existing
ValueAsMetadata node, with the difference being that it tracks a list instead of
a single value. Internally, it uses ValueAsMetadata to track the individual
values, but there is also a reasonable amount of DIArgList-specific
value-tracking logic on top of that. Similar to ValueAsMetadata, it is a special
case in parsing and printing due to the fact that it requires a function state
(as it may reference function-local values).

This patch should not result in any immediate functional change; it allows for
DIArgLists to be parsed and printed, but debug variable intrinsics do not yet
recognize them as a valid argument (outside of parsing).

Differential Revision: https://reviews.llvm.org/D88175
2021-03-05 17:02:24 +00:00
Fangrui Song ed02f52d28 Fix unstable SmallPtrSet iteration issues due to collectUsedGlobalVariables
While here, decrease inline element counts from 8 to 4. See D97128 for the choice.

Depends on D97128 (which added a new SmallVecImpl overload for collectUsedGlobalVariables).

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D97139
2021-02-23 16:09:05 -08:00
Leonard Chan 1c932baeaa [llvm][Bitcode] Add bitcode reader/writer for DSOLocalEquivalent
This is necessary for compilation with [thin]lto.

Differential Revision: https://reviews.llvm.org/D96170
2021-02-22 10:37:57 -08:00
Wei Wang 80dc0661bd [LTO] Perform DSOLocal propagation in combined index
Perform DSOLocal propagation within summary list of every GV. This
avoids the repeated query of this information during function
importing.

Differential Revision: https://reviews.llvm.org/D96398
2021-02-12 22:58:26 -08:00
James Y Knight db00953ff3 Fix bitcode decoder error in "Encode alignment attribute for `atomicrmw`"
The wrong record field number was being used in bitcode decoding,
which broke a self-hosted LTO build. (Yet, somehow, this _doesn't_
seem to have broken simple bitcode encode/decode roundtrip tests, and
I'm not sure why...)

Fixes commit d06ab79816
2021-02-11 22:29:03 -05:00
Guillaume Chatelet 17517f3178 Encode alignment attribute for `cmpxchg`
This is a follow up patch to D83136 adding the align attribute to `cmpxchg`.
See also D83465 for `atomicrmw`.

Differential Revision: https://reviews.llvm.org/D87443
2021-02-11 15:17:50 -05:00
Guillaume Chatelet d06ab79816 Encode alignment attribute for `atomicrmw`
This is a follow up patch to D83136 adding the align attribute to `atomicwmw`.

Differential Revision: https://reviews.llvm.org/D83465
2021-02-11 15:17:37 -05:00
Fangrui Song 54fb3ca96e [ThinLTO] Add Visibility bits to GlobalValueSummary::GVFlags
Imported functions and variable get the visibility from the module supplying the
definition.  However, non-imported definitions do not get the visibility from
(ELF) the most constraining visibility among all modules (Mach-O) the visibility
of the prevailing definition.

This patch

* adds visibility bits to GlobalValueSummary::GVFlags
* computes the result visibility and propagates it to all definitions

Protected/hidden can imply dso_local which can enable some optimizations (this
is stronger than GVFlags::DSOLocal because the implied dso_local can be
leveraged for ELF -shared while default visibility dso_local has to be cleared
for ELF -shared).

Note: we don't have summaries for declarations, so for ELF if a declaration has
the most constraining visibility, the result visibility may not be that one.

Differential Revision: https://reviews.llvm.org/D92900
2021-01-27 10:43:51 -08:00
Petr Hosek bb9eb19829 Support for instrumenting only selected files or functions
This change implements support for applying profile instrumentation
only to selected files or functions. The implementation uses the
sanitizer special case list format to select which files and functions
to instrument, and relies on the new noprofile IR attribute to exclude
functions from instrumentation.

Differential Revision: https://reviews.llvm.org/D94820
2021-01-26 17:13:34 -08:00
Petr Hosek 1e634f3952 Revert "Support for instrumenting only selected files or functions"
This reverts commit 4edf35f11a because
the test fails on Windows bots.
2021-01-26 12:25:28 -08:00
Petr Hosek 4edf35f11a Support for instrumenting only selected files or functions
This change implements support for applying profile instrumentation
only to selected files or functions. The implementation uses the
sanitizer special case list format to select which files and functions
to instrument, and relies on the new noprofile IR attribute to exclude
functions from instrumentation.

Differential Revision: https://reviews.llvm.org/D94820
2021-01-26 11:11:39 -08:00
Kazu Hirata 16baad8f4e [llvm] Use pop_back_val (NFC) 2021-01-24 12:18:57 -08:00
Kazu Hirata 352fcfc697 [llvm] Use llvm::sort (NFC) 2021-01-17 10:39:45 -08:00
Kazu Hirata 1d0bc05551 [llvm] Use llvm::append_range (NFC) 2021-01-06 18:27:33 -08:00
Luo, Yuanke 981a0bd858 [X86] Add x86_amx type for intel AMX.
The x86_amx is used for AMX intrisics. <256 x i32> is bitcast to x86_amx when
it is used by AMX intrinsics, and x86_amx is bitcast to <256 x i32> when it
is used by load/store instruction. So amx intrinsics only operate on type x86_amx.
It can help to separate amx intrinsics from llvm IR instructions (+-*/).
Thank Craig for the idea. This patch depend on https://reviews.llvm.org/D87981.

Differential Revision: https://reviews.llvm.org/D91927
2020-12-30 13:52:13 +08:00
Chih-Ping Chen 5f75dcf571 [DebugInfo] Support Fortran 'use <external module>' statement.
The main change is to add a 'IsDecl' field to DIModule so
that when IsDecl is set to true, the debug info entry generated
for the module would be marked as a declaration. That way, the debugger
would look up the definition of the module in the gloabl scope.

Please see the comments in llvm/test/DebugInfo/X86/dimodule.ll
for what the debug info entries would look like.

Differential Revision: https://reviews.llvm.org/D93462
2020-12-18 13:10:57 -05:00
Rong Xu 3733463dbb [IR][PGO] Add hot func attribute and use hot/cold attribute in func section
Clang FE currently has hot/cold function attribute. But we only have
cold function attribute in LLVM IR.

This patch adds support of hot function attribute to LLVM IR.  This
attribute will be used in setting function section prefix/suffix.
Currently .hot and .unlikely suffix only are added in PGO (Sample PGO)
compilation (through isFunctionHotInCallGraph and
isFunctionColdInCallGraph).

This patch changes the behavior. The new behavior is:
(1) If the user annotates a function as hot or isFunctionHotInCallGraph
    is true, this function will be marked as hot. Otherwise,
(2) If the user annotates a function as cold or
    isFunctionColdInCallGraph is true, this function will be marked as
    cold.

The changes are:
(1) user annotated function attribute will used in setting function
    section prefix/suffix.
(2) hot attribute overwrites profile count based hotness.
(3) profile count based hotness overwrite user annotated cold attribute.

The intention for these changes is to provide the user a way to mark
certain function as hot in cases where training input is hard to cover
all the hot functions.

Differential Revision: https://reviews.llvm.org/D92493
2020-12-17 18:41:12 -08:00
Gulfem Savrun Yeniceri 7c0e3a77bc [clang][IR] Add support for leaf attribute
This patch adds support for leaf attribute as an optimization hint
in Clang/LLVM.

Differential Revision: https://reviews.llvm.org/D90275
2020-12-14 14:48:17 -08:00
Fangrui Song b5ad32ef5c Migrate deprecated DebugLoc::get to DILocation::get
This migrates all LLVM (except Kaleidoscope and
CodeGen/StackProtector.cpp) DebugLoc::get to DILocation::get.

The CodeGen/StackProtector.cpp usage may have a nullptr Scope
and can trigger an assertion failure, so I don't migrate it.

Reviewed By: #debug-info, dblaikie

Differential Revision: https://reviews.llvm.org/D93087
2020-12-11 12:45:22 -08:00
Zhengyang Liu 75f50e15bf Adding PoisonValue for representing poison value explicitly in IR
Define ConstantData::PoisonValue.
Add support for poison value to LLLexer/LLParser/BitcodeReader/BitcodeWriter.
Add support for poison value to llvm-c interface.
Add support for poison value to OCaml binding.
Add m_Poison in PatternMatch.

Differential Revision: https://reviews.llvm.org/D71126
2020-11-25 17:33:51 -07:00
Nick Desaulniers f4c6080ab8 Revert "[IR] add fn attr for no_stack_protector; prevent inlining on mismatch"
This reverts commit b7926ce6d7.

Going with a simpler approach.
2020-11-17 17:27:14 -08:00
Nick Desaulniers dd6087cac0 Revert "[BitCode] decode nossp fn attr"
This reverts commit 0b11d018cc.

Going with a simpler approach.
2020-11-17 17:27:14 -08:00
serge-sans-paille 9218ff50f9 llvmbuildectomy - replace llvm-build by plain cmake
No longer rely on an external tool to build the llvm component layout.

Instead, leverage the existing `add_llvm_componentlibrary` cmake function and
introduce `add_llvm_component_group` to accurately describe component behavior.

These function store extra properties in the created targets. These properties
are processed once all components are defined to resolve library dependencies
and produce the header expected by llvm-config.

Differential Revision: https://reviews.llvm.org/D90848
2020-11-13 10:35:24 +01:00
Simon Pilgrim bd76f3724d [Bitcode] Make some basic PlaceholderQueue/MetadataLoaderImpl helper methods const. NFCI.
Fixes a number of cppcheck remarks.
2020-10-31 12:16:48 +00:00
Simon Pilgrim 5bf45ee156 BitcodeReader::popValue - pass SmallVectorImpl<> as const reference. NFCI.
Fixes cppcheck warning.
2020-10-30 14:33:19 +00:00
Mircea Trofin 735ab4be35 [ThinLTO] Fix .llvmcmd emission
llvm::EmbedBitcodeInModule needs (what used to be called) EmbedMarker
set, in order to emit .llvmcmd. EmbedMarker is really about embedding the
command line, so renamed the parameter accordingly, too.

This was not caught at test because the check-prefix was incorrect, but
FileCheck does not report that when multiple prefixes are provided. A
separate patch will address that.

Differential Revision: https://reviews.llvm.org/D90278
2020-10-28 17:45:30 -07:00
Alok Kumar Sharma a6dd01afa3 [DebugInfo] Support for DW_TAG_generic_subrange
This is needed to support fortran assumed rank arrays which
have runtime rank.

  Summary:
Fortran assumed rank arrays have dynamic rank. DWARF TAG
DW_TAG_generic_subrange is needed to support that.

  Testing:
unit test cases added (hand-written)
check llvm
check debug-info

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D89218
2020-10-29 01:34:15 +05:30
Mircea Trofin 6fa35541a0 [NFC][ThinLTO] Change command line passing to EmbedBitcodeInModule
Changing to pass by ref - less null checks to worry about.

Differential Revision: https://reviews.llvm.org/D90330
2020-10-28 12:33:39 -07:00
Nick Desaulniers 0b11d018cc [BitCode] decode nossp fn attr
I missed this in https://reviews.llvm.org/D87956.

Reviewed By: void

Differential Revision: https://reviews.llvm.org/D90177
2020-10-26 13:06:54 -07:00
Nick Desaulniers b7926ce6d7 [IR] add fn attr for no_stack_protector; prevent inlining on mismatch
It's currently ambiguous in IR whether the source language explicitly
did not want a stack a stack protector (in C, via function attribute
no_stack_protector) or doesn't care for any given function.

It's common for code that manipulates the stack via inline assembly or
that has to set up its own stack canary (such as the Linux kernel) would
like to avoid stack protectors in certain functions. In this case, we've
been bitten by numerous bugs where a callee with a stack protector is
inlined into an __attribute__((__no_stack_protector__)) caller, which
generally breaks the caller's assumptions about not having a stack
protector. LTO exacerbates the issue.

While developers can avoid this by putting all no_stack_protector
functions in one translation unit together and compiling those with
-fno-stack-protector, it's generally not very ergonomic or as
ergonomic as a function attribute, and still doesn't work for LTO. See also:
https://lore.kernel.org/linux-pm/20200915172658.1432732-1-rkir@google.com/
https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolvanen@google.com/T/#u

Typically, when inlining a callee into a caller, the caller will be
upgraded in its level of stack protection (see adjustCallerSSPLevel()).
By adding an explicit attribute in the IR when the function attribute is
used in the source language, we can now identify such cases and prevent
inlining.  Block inlining when the callee and caller differ in the case that one
contains `nossp` when the other has `ssp`, `sspstrong`, or `sspreq`.

Fixes pr/47479.

Reviewed By: void

Differential Revision: https://reviews.llvm.org/D87956
2020-10-23 11:55:39 -07:00
David Stenberg 0c0fcea557 Handle value uses wrapped in metadata for the use-list order
When generating the use-list order, also consider value uses that are
operands which are wrapped in metadata; e.g. llvm.dbg.value operands.

This fixes PR36778. The test case is based on the reproducer from that
report.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D53758
2020-10-20 20:05:59 +02:00
Atmn Patel 595c615606 [IR] Adds mustprogress as a LLVM IR attribute
This adds the LLVM IR attribute `mustprogress` as defined in LangRef through D86233. This attribute will be applied to functions with in languages like C++ where forward progress is guaranteed. Functions without this attribute are not required to make progress.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D85393
2020-10-20 03:09:57 -04:00
Matt Arsenault 0a7cd99a70 Reapply "OpaquePtr: Add type to sret attribute"
This reverts commit eb9f7c28e5.

Previously this was incorrectly handling linking of the contained
type, so this merges the fixes from D88973.
2020-10-16 11:05:02 -04:00
Craig Topper a184c758b7 [BitCodeAnalyzer] Add a few missing TYPE_CODES and MODULE_CODE_COMDAT to GetCodeName
Happened to notice some of these printing as UnknownCode while running llvm-bcanalyzer on a bc file I had.

Differential Revision: https://reviews.llvm.org/D86900
2020-10-12 15:43:12 -07:00
Teresa Johnson c27ab339ad Restore "[ThinLTO] Avoid temporaries when loading global decl attachment metadata"
This restores commit ab1b4810b5 which was
reverted in 01b9deba76, with a fix for the
issue it caused. We should use a temporary BitstreamCursor when
loading the global decl attachment records so that the abbrev ids held
in the lazy loading IndexCursor are not clobbered. Enhanced the test so
that the issue is exposed there.

Original description:

When performing ThinLTO importing, the metadata loader attempts to lazy
load, by building an index. However, module level global decl attachment
metadata was being parsed early while building the index, since the
associated (module level) global values aren't materialized on demand.
This results in the creation of forward reference temporary metadatas,
which are expensive.

Normally, these module level global values don't have much attached
metadata. However, in the case of -fwhole-program-vtables (e.g. for
whole program devirtualization), the vtables may have many attached type
metadatas. This was resulting in very slow performance when performing
ThinLTO importing with the default lazy loading.

This patch restructures the handling of these global decl attachment
records, delaying their parsing until after the lazy loading index has
been built. Then the parser can use the interface that loads from the
index, which resolves forward references immediately instead of creating
expensive temporaries.

For one ThinLTO backend that imports from modules containing huge
numbers of vtables and associated types, I measured the following
compile times for the metadata materialization during function
importing, rounded to nearest second:

No -fwhole-program-vtables:
  Lazy loading on (head):  1s
  Lazy loading off (head): 3s
  Lazy loading on (patch): 1s

With -fwhole-program-vtables:
  Lazy loading on (head):  440s
  Lazy loading off (head): 4s
  Lazy loading on (patch): 2s

Differential Revision: https://reviews.llvm.org/D87970
2020-10-12 10:11:56 -07:00
Alok Kumar Sharma 96bd4d34a2 [DebugInfo] Support for DWARF attribute DW_AT_rank
This patch adds support for DWARF attribute DW_AT_rank.

  Summary:
Fortran assumed rank arrays have dynamic rank. DWARF attribute
DW_AT_rank is needed to support that.

  Testing:
unit test cases added (hand-written)
check llvm
check debug-info

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D89141
2020-10-10 17:51:12 +05:30
Sam McCall b953a01b2c Reapply [ADT] function_ref's constructor is unavailable if the argument is not callable.
This reverts commit 281703e67f.

GCC 5.4 bugs are worked around by avoiding use of variable templates.

Differential Revision: https://reviews.llvm.org/D88977
2020-10-07 18:31:12 +02:00
Sam McCall 281703e67f Revert "[ADT] function_ref's constructor is unavailable if the argument is not callable."
This reverts commit 4cae6228d1.

Breaks GCC build:
http://lab.llvm.org:8011/#/builders/8/builds/33/steps/6/logs/stdio
2020-10-07 16:37:13 +02:00
Sam McCall 4cae6228d1 [ADT] function_ref's constructor is unavailable if the argument is not callable.
This allows overload sets containing function_ref arguments to work correctly
Otherwise they're ambiguous as anything "could be" converted to a function_ref.

This matches proposed std::function_ref, absl::function_ref, etc.

Differential Revision: https://reviews.llvm.org/D88901
2020-10-07 16:31:09 +02:00
Tres Popp eb9f7c28e5 Revert "OpaquePtr: Add type to sret attribute"
This reverts commit 55c4ff91bd.

Issues were introduced as discussed in https://reviews.llvm.org/D88241
where this change made previous bugs in the linker and BitCodeWriter
visible.
2020-09-29 10:31:04 +02:00
Matt Arsenault 55c4ff91bd OpaquePtr: Add type to sret attribute
Make the corresponding change that was made for byval in
b7141207a4. Like byval, this requires a
bulk update of the test IR tests to include the type before this can
be mandatory.
2020-09-25 14:07:30 -04:00
Fangrui Song 01b9deba76 Revert D87970 "[ThinLTO] Avoid temporaries when loading global decl attachment metadata"
This reverts commit ab1b4810b5.

It caused an issue in llvm::lto::thinBackend for a -fsanitize=cfi build.

```
AbbrevNo is 0 => "Invalid abbrev number"
0  llvm::BitstreamCursor::getAbbrev (this=0x9db4c8, AbbrevID=4) at llvm/include/llvm/Bitstream/BitstreamReader.h:528
1  0x00007f5f777a6eb4 in llvm::BitstreamCursor::readRecord (this=0x9db4c8, AbbrevID=4, Vals=llvm::SmallVector of Size 0, Capacity 64, Blob=0x7ffcd0e26558) at
usr/local/google/home/maskray/llvm/llvm/lib/Bitstream/Reader/BitstreamReader.cpp:228
2  0x00007f5f796bf633 in llvm::MetadataLoader::MetadataLoaderImpl::lazyLoadOneMetadata (this=0x9db3a0, ID=188, Placeholders=...) at /usr/local/google/home/mas
ray/llvm/llvm/lib/Bitcode/Reader/MetadataLoader.cpp:1091
3  0x00007f5f796c2527 in llvm::MetadataLoader::MetadataLoaderImpl::getMetadataFwdRefOrLoad (this=0x9db3a0, ID=188) at llvm
lib/Bitcode/Reader/MetadataLoader.cpp:668
4  0x00007f5f796bfff3 in llvm::MetadataLoader::getMetadataFwdRefOrLoad (this=0xd31580, Idx=188) at llvm/lib/Bitcode/Reader
MetadataLoader.cpp:2290
5  0x00007f5f79638265 in (anonymous namespace)::BitcodeReader::parseFunctionBody (this=0xd312e0, F=0x9de758) at llvm/lib/B
tcode/Reader/BitcodeReader.cpp:3938
6  0x00007f5f79635d32 in (anonymous namespace)::BitcodeReader::materialize (this=0xd312e0, GV=0x9de758) at llvm/lib/Bitcod
/Reader/BitcodeReader.cpp:5408
7  0x00007f5f7f8dbe3e in llvm::Module::materialize (this=0x9b92c0, GV=0x9de758) at llvm/lib/IR/Module.cpp:442
8  0x00007f5f7f7f8fbe in llvm::GlobalValue::materialize (this=0x9de758) at llvm/lib/IR/Globals.cpp:50
9  0x00007f5f83b9b5f5 in llvm::FunctionImporter::importFunctions (this=0x7ffcd0e2a730, DestModule=..., ImportList=...) at
llvm/lib/Transforms/IPO/FunctionImport.cpp:1182
```
2020-09-23 10:24:08 -07:00
Teresa Johnson ab1b4810b5 [ThinLTO] Avoid temporaries when loading global decl attachment metadata
When performing ThinLTO importing, the metadata loader attempts to lazy
load, by building an index. However, module level global decl attachment
metadata was being parsed early while building the index, since the
associated (module level) global values aren't materialized on demand.
This results in the creation of forward reference temporary metadatas,
which are expensive.

Normally, these module level global values don't have much attached
metadata. However, in the case of -fwhole-program-vtables (e.g. for
whole program devirtualization), the vtables may have many attached type
metadatas. This was resulting in very slow performance when performing
ThinLTO importing with the default lazy loading.

This patch restructures the handling of these global decl attachment
records, delaying their parsing until after the lazy loading index has
been built. Then the parser can use the interface that loads from the
index, which resolves forward references immediately instead of creating
expensive temporaries.

For one ThinLTO backend that imports from modules containing huge
numbers of vtables and associated types, I measured the following
compile times for the metadata materialization during function
importing, rounded to nearest second:

No -fwhole-program-vtables:
  Lazy loading on (head):  1s
  Lazy loading off (head): 3s
  Lazy loading on (patch): 1s

With -fwhole-program-vtables:
  Lazy loading on (head):  440s
  Lazy loading off (head): 4s
  Lazy loading on (patch): 2s

Differential Revision: https://reviews.llvm.org/D87970
2020-09-22 20:32:07 -07:00
Simon Pilgrim d566771779 ValueList.cpp - remove unnecessary includes. NFCI.
Already included in ValueList.h
2020-09-17 15:06:01 +01:00
Simon Pilgrim abe0d8551d MetadataLoader.cpp - remove unnecessary StringRef include. NFCI.
Already included in MetadataLoader.h
2020-09-17 13:18:54 +01:00
Jianzhou Zhao 11201315d5 Flush bitcode incrementally for LTO output
Bitcode writer does not flush buffer until the end by default. This is
fine to small bitcode files. When -flto,--plugin-opt=emit-llvm,-gmlt are
used, the final bitcode file is large, for example, >8G. Keeping all
data in memory consumes a lot of memory.

This change allows bitcode writer flush data to disk early when buffered
data size is above some threshold. This is only enabled when lld emits
LLVM bitcode.

One issue to address is backpatching bitcode: subblock length, function
body indexes, meta data indexes need to backfill. If buffer can be
flushed partially, we introduced raw_fd_stream that supports
read/seek/write, and enables backpatching bitcode flushed in disk.

Reviewed-by: tejohnson, MaskRay

Differential Revision: https://reviews.llvm.org/D86905
2020-09-17 03:32:31 +00:00
Simon Pilgrim c6a82fdbf2 ValueEnumerator.cpp - remove duplicate includes. NFCI.
Remove headers already included in ValueEnumerator.h
2020-09-16 18:32:28 +01:00
Mircea Trofin e543708e5e [NFC][ThinLTO] Let llvm::EmbedBitcodeInModule handle serialization.
llvm::EmbedBitcodeInModule handles serializing the passed-in module, if
the provided MemoryBufferRef is invalid. This is already the path taken
in one of the uses of the API - clang::EmbedBitcode, when called from
BackendConsumer::HandleTranslationUnit - so might as well do the same
here and reduce (by very little) code duplication.

The only difference this patch introduces is that the serialization happens
with ShouldPreserveUseListOrder set to true.

Differential Revision: https://reviews.llvm.org/D87339
2020-09-10 10:25:00 -07:00
Guillaume Chatelet 5a4a0cfcfb [NFC] Separate bitcode reading for FUNC_CODE_INST_CMPXCHG(_OLD)
This is preparatory work to unable storing alignment for AtomicCmpXchgInst.
See D83136 for context and bug: https://bugs.llvm.org/show_bug.cgi?id=27168

This is the fixed version of D83375, which was submitted and reverted.

Differential Revision: https://reviews.llvm.org/D87373
2020-09-09 19:10:30 +00:00
Fangrui Song 6ae7b403c3 Set alignment of .llvmbc and .llvmcmd to 1
Otherwise their alignment is dependent on the size of the section.  If the size
is large than 16, the alignment will be 16.

16 is a bad choice for both .llvmbc and .llvmcmd because the padding between two
contributions from input sections is of a variable size.

A bitstream is actually guaranteed to be 4-byte aligned, but consumers don't
need this property.
2020-08-29 18:27:34 -07:00
David Sherwood f4257c5832 [SVE] Make ElementCount members private
This patch changes ElementCount so that the Min and Scalable
members are now private and can only be accessed via the get
functions getKnownMinValue() and isScalable(). In addition I've
added some other member functions for more commonly used operations.
Hopefully this makes the class more useful and will reduce the
need for calling getKnownMinValue().

Differential Revision: https://reviews.llvm.org/D86065
2020-08-28 14:43:53 +01:00
Sourabh Singh Tomar f91d18eaa9 [DebugInfo][flang]Added support for representing Fortran assumed length strings
This patch adds support for representing Fortran `character(n)`.

Primarily patch is based out of D54114 with appropriate modifications.

Test case IR is generated using our downstream classic-flang. We're in process
of upstreaming flang PR's but classic-flang has dependencies on llvm, so
this has to get in first.

Patch includes functional test case for both IR and corresponding
dwarf, furthermore it has been manually tested as well using GDB.

Source snippet:
```
 program assumedLength
   call sub('Hello')
   call sub('Goodbye')
   contains
   subroutine sub(string)
           implicit none
           character(len=*), intent(in) :: string
           print *, string
   end subroutine sub
 end program assumedLength
```

GDB:
```
(gdb) ptype string
type = character (5)
(gdb) p string
$1 = 'Hello'
```

Reviewed By: aprantl, schweitz

Differential Revision: https://reviews.llvm.org/D86305
2020-08-22 10:13:40 +05:30
Vitaly Buka fc4fd89852 [StackSafety] Use ValueInfo in ParamAccess::Call
This avoid GUID lookup in Index.findSummaryInModule.
Follow up for D81242.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D85269
2020-08-14 12:42:44 -07:00
Kai Nacke b3aece0531 [SystemZ/ZOS] Add binary format goff and operating system zos to the triple
Adds the binary format goff and the operating system zos to the triple
class. goff is selected as default binary format if zos is choosen as
operating system. No further functionality is added.

Reviewers: efriedma, tahonermann, hubert.reinterpertcast, MaskRay

Reviewed By: efriedma, tahonermann, hubert.reinterpertcast

Differential Revision: https://reviews.llvm.org/D82081
2020-08-11 05:26:26 -04:00
Guillaume Chatelet 754deffd11 [NFC] Move BitcodeCommon.h from Bitstream to Bitcode 2020-07-27 20:49:17 +00:00
Guillaume Chatelet 1b4d24912a [NFC] Replace ".size() < 1" with ".empty()" 2020-07-27 13:54:53 +00:00
Guillaume Chatelet d9bbe85943 [Alignment][NFC] Update Bitcodewriter to use Align
Differential Revision: https://reviews.llvm.org/D83533
2020-07-27 08:16:45 +00:00
Steven Wu ac375c2fe3 [Bitcode] Avoid duplicating linker option when upgrading
Summary:
The upgrading path from old ModuleFlag based linker options to the new
NamedMetadata based linker option in in materializeMetadata() which gets
called once for the module and once for every GV. The linker options are
getting dup'ed every time and it can create massive amount of the linker
options in the object file that gets created from old bitcode. Fix the
problem by checking if the new option exists or not before upgrade
again.

rdar://64543389

Reviewers: pcc, t.p.northover, dexonsmith, arphaman

Reviewed By: arphaman

Subscribers: hiraditya, jkorous, ributzka, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83688
2020-07-23 13:07:28 -07:00
Steven Wu 78709345fb [Bitcode] Drop invalid branch_weight in BitcodeReader
Summary:
If bitcode reader gets an invalid branch weight, drop that from the
inputs. This allows us to read the broken modules we generated before
the verifier was able to catch this.

rdar://64870641

Reviewers: yrouban, t.p.northover, dexonsmith, arphaman, aprantl

Reviewed By: aprantl

Subscribers: aprantl, hiraditya, jkorous, ributzka, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83699
2020-07-23 09:07:15 -07:00
Alok Kumar Sharma 2d10258a31 [DebugInfo] Support for DW_AT_associated and DW_AT_allocated.
Summary:
This support is needed for the Fortran array variables with pointer/allocatable
attribute. This support enables debugger to identify the status of variable
whether that is currently allocated/associated.

  for pointer array (before allocation/association)
  without DW_AT_associated

(gdb) pt ptr
type = integer (140737345375288:140737354129776)
(gdb) p ptr
value requires 35017956 bytes, which is more than max-value-size

  with DW_AT_associated

(gdb) pt ptr
type = integer (:)
(gdb) p ptr
$1 = <not associated>

  for allocatable array (before allocation)

  without DW_AT_allocated

(gdb) pt arr
type = integer (140737345375288:140737354129776)
(gdb) p arr
value requires 35017956 bytes, which is more than max-value-size

  with DW_AT_allocated

(gdb) pt arr
type = integer, allocatable (:)
(gdb) p arr
$1 = <not allocated>

    Testing
- unit test cases added
- check-llvm
- check-debuginfo

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D83544
2020-07-20 19:54:35 +05:30
Matt Arsenault 5e999cbe8d IR: Define byref parameter attribute
This allows tracking the in-memory type of a pointer argument to a
function for ABI purposes. This is essentially a stripped down version
of byval to remove some of the stack-copy implications in its
definition.

This includes the base IR changes, and some tests for places where it
should be treated similarly to byval. Codegen support will be in a
future patch.

My original attempt at solving some of these problems was to repurpose
byval with a different address space from the stack. However, it is
technically permitted for the callee to introduce a write to the
argument, although nothing does this in reality. There is also talk of
removing and replacing the byval attribute, so a new attribute would
need to take its place anyway.

This is intended avoid some optimization issues with the current
handling of aggregate arguments, as well as fixes inflexibilty in how
frontends can specify the kernel ABI. The most honest representation
of the amdgpu_kernel convention is to expose all kernel arguments as
loads from constant memory. Today, these are raw, SSA Argument values
and codegen is responsible for turning these into loads.

Background:

There currently isn't a satisfactory way to represent how arguments
for the amdgpu_kernel calling convention are passed. In reality,
arguments are passed in a single, flat, constant memory buffer
implicitly passed to the function. It is also illegal to call this
function in the IR, and this is only ever invoked by a driver of some
kind.

It does not make sense to have a stack passed parameter in this
context as is implied by byval. It is never valid to write to the
kernel arguments, as this would corrupt the inputs seen by other
dispatches of the kernel. These argumets are also not in the same
address space as the stack, so a copy is needed to an alloca. From a
source C-like language, the kernel parameters are invisible.
Semantically, a copy is always required from the constant argument
memory to a mutable variable.

The current clang calling convention lowering emits raw values,
including aggregates into the function argument list, since using
byval would not make sense. This has some unfortunate consequences for
the optimizer. In the aggregate case, we end up with an aggregate
store to alloca, which both SROA and instcombine turn into a store of
each aggregate field. The optimizer never pieces this back together to
see that this is really just a copy from constant memory, so we end up
stuck with expensive stack usage.

This also means the backend dictates the alignment of arguments, and
arbitrarily picks the LLVM IR ABI type alignment. By allowing an
explicit alignment, frontends can make better decisions. For example,
there's real no advantage to an aligment higher than 4, so a frontend
could choose to compact the argument layout. Similarly, there is a
high penalty to using an alignment lower than 4, so a frontend could
opt into more padding for small arguments.

Another design consideration is when it is appropriate to expose the
fact that these arguments are all really passed in adjacent
memory. Currently we have a late IR optimization pass in codegen to
rewrite the kernel argument values into explicit loads to enable
vectorization. In most programs, unrelated argument loads can be
merged together. However, exposing this property directly from the
frontend has some disadvantages. We still need a way to track the
original argument sizes and alignments to report to the driver. I find
using some side-channel, metadata mechanism to track this
unappealing. If the kernel arguments were exposed as a single buffer
to begin with, alias analysis would be unaware that the padding bits
betewen arguments are meaningless. Another family of problems is there
are still some gaps in replacing all of the available parameter
attributes with metadata equivalents once lowered to loads.

The immediate plan is to start using this new attribute to handle all
aggregate argumets for kernels. Long term, it makes sense to migrate
all kernel arguments, including scalars, to be passed indirectly in
the same manner.

Additional context is in D79744.
2020-07-20 10:23:09 -04:00
Eric Christopher cc28058c13 Temporarily revert "[NFC] Separate bitcode reading for FUNC_CODE_INST_CMPXCHG(_OLD)"
as it wasn't NFC and is causing issues with thinlto bitcode reading.

I've followed up offline with reproduction instructions and testcases.

This reverts commit 30582457b4.
2020-07-10 15:21:00 -07:00
Guillaume Chatelet 30582457b4 [NFC] Separate bitcode reading for FUNC_CODE_INST_CMPXCHG(_OLD)
This is preparatory work to unable storing alignment for AtomicCmpXchgInst.
See D83136 for context and bug: https://bugs.llvm.org/show_bug.cgi?id=27168

Differential Revision: https://reviews.llvm.org/D83375
2020-07-10 04:27:39 +00:00
Gui Andrade ff7900d5de [LLVM] Accept `noundef` attribute in function definitions/calls
The `noundef` attribute indicates an argument or return value which
may never have an undef value representation.

This patch allows LLVM to parse the attribute.

Differential Revision: https://reviews.llvm.org/D83412
2020-07-08 19:02:04 +00:00