This patch diagnoses invalid references of global host variables in device,
global, or host device functions.
Differential Revision: https://reviews.llvm.org/D91281
In C++ when a reference variable is captured by copy, the lambda
is supposed to make a copy of the referenced variable in the captures
and refer to the copy in the lambda. Therefore, it is valid to capture
a reference to a host global variable in a device lambda since the
device lambda will refer to the copy of the host global variable instead
of access the host global variable directly.
However, clang tries to avoid capturing of reference to a host global variable
if it determines the use of the reference variable in the lambda function is
not odr-use. Clang also tries to emit load of the reference to a global variable
as load of the global variable if it determines that the reference variable is
a compile-time constant.
For a device lambda to capture a reference variable to host global variable
and use the captured value, clang needs to be taught that in such cases the use of the reference
variable is odr-use and the reference variable is not compile-time constant.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D91088
This is yet another attempt at providing support for epilogue
vectorization following discussions raised in RFC http://llvm.1065342.n5.nabble.com/llvm-dev-Proposal-RFC-Epilog-loop-vectorization-tt106322.html#none
and reviews D30247 and D88819.
Similar to D88819, this patch achieve epilogue vectorization by
executing a single vplan twice: once on the main loop and a second
time on the epilogue loop (using a different VF). However it's able
to handle more loops, and generates more optimal control flow for
cases where the trip count is too small to execute any code in vector
form.
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D89566
The author of "https://reviews.llvm.org/D92428" marked
'resize_tls_dynamic.cpp' with XFAIL for powerpc64 since
it fails on a bunch of PowerPC buildbots. However, the
original test case passes on clang-ppc64le-rhel bot. So
marking this as XFAIL makes this bot to fail as the test
case passes unexpectedly. We are marking this unsupported
on all PowerPC64 for now until it is fixed for all the
PowerPC buildbots.
Given that OpState already implicit converts to Operator*, this seems reasonable.
The alternative would be to add more functions to OpState which forward to Operation.
Reviewed By: rriddle, ftynse
Differential Revision: https://reviews.llvm.org/D92266
This might be a small improvement in readability, but the
real motivation is to make it easier to adapt the code to
deal with intrinsics like 'maxnum' and/or integer min/max.
There is potentially help in doing that with D92086, but
we might also just add specialized wrappers here to deal
with the expected patterns.
OpenMPIRBuilder::createParallel outlines the body region of the parallel
construct into a new function that accepts any value previously defined outside
the region as a function argument. This function is called back by OpenMP
runtime function __kmpc_fork_call, which expects trailing arguments to be
pointers. If the region uses a value that is not of a pointer type, e.g. a
struct, the produced code would be invalid. In such cases, make createParallel
emit IR that stores the value on stack and pass the pointer to the outlined
function instead. The outlined function then loads the value back and uses as
normal.
Reviewed By: jdoerfert, llitchev
Differential Revision: https://reviews.llvm.org/D92189
This also removes the empty extra "module asm" that would be created,
and updates the test to reflect that while making it more explicit.
Broken out from https://reviews.llvm.org/D92335
This patch consists of the addition of some common additional
extended mnemonics to the SystemZ target.
- These are jnop, jct, jctg, jas, jasl, jxh, jxhg, jxle,
jxleg, bru, brul, br*, br*l.
- These mnemonics and the instructions they map to are
defined here, Chapter 4 - Branching with extended
mnemonic codes.
- Except for jnop (which is a variant of brc 0, label), every
other mnemonic is marked as a MnemonicAlias since there is
already a "defined" instruction with the same encoding
and/or condition mask values.
- brc 0, label doesn't have a defined extended mnemonic, thus
jnop is defined using as an InstAlias. Furthermore, the
applyMnemonicAliases function is called in the overridden
parseInstruction function in SystemZAsmParser.cpp to ensure
any mnemonic aliases are applied before any further
processing on the instruction is done.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D92185
In this patch I have added support for a new loop hint called
vectorize.scalable.enable that says whether we should enable scalable
vectorization or not. If a user wants to instruct the compiler to
vectorize a loop with scalable vectors they can now do this as
follows:
br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !2
...
!2 = !{!2, !3, !4}
!3 = !{!"llvm.loop.vectorize.width", i32 8}
!4 = !{!"llvm.loop.vectorize.scalable.enable", i1 true}
Setting the hint to false simply reverts the behaviour back to the
default, using fixed width vectors.
Differential Revision: https://reviews.llvm.org/D88962
The code that gets the ScriptInterpreter was not considering the
case that it receives a Lua interpreter.
Differential Revision: https://reviews.llvm.org/D92249
This implementation of `ELFDumper<ELFT>::printAttributes()` in llvm-readobj has issues:
1) It crashes when the content of the attribute section is empty.
2) It uses `unwrapOrError` and `reportWarning` calls, though
ideally we want to use `reportUniqueWarning`.
3) It contains a TODO about redundant format version check.
`lib/Support/ELFAttributeParser.cpp` uses a hardcoded constant instead of the named constant.
This patch fixes all these issues.
Differential revision: https://reviews.llvm.org/D92318
We currently reject all templates that have either zero args or that have a
parameter pack without a name. Both cases are actually allowed in C++, so
rejecting them leads to LLDB instead falling back to a dummy 'void' type. This
leads to all kind of errors later on (most notable, variables that have such
template types appear to be missing as we can't have 'void' variables and
inheriting from such a template type will cause Clang to hit some asserts when
finding that the base class is 'void').
This just removes the too strict tests and adds a few tests for this stuff (+
some combinations of these tests with preceding template parameters).
Things that I left for follow-up patches:
* All the possible interactions with template-template arguments which seem like a whole new source of possible bugs.
* Function templates which completely lack sanity checks.
* Variable templates are not implemented.
* Alias templates are not implemented too.
* The rather strange checks that just make sure that the separate list of
template arg names and values always have the same length. I believe those
ought to be asserts, but my current plan is to move both those things into a
single list that can't end up in this inconsistent state.
Reviewed By: JDevlieghere, shafik
Differential Revision: https://reviews.llvm.org/D92425
These were re-added by fbfb1c7909 but
should not have been. This removes the old experimental versions of the
reduction intrinsics again, leaving the new non experimental ones.
Differential Revision: https://reviews.llvm.org/D92411
In lowering of FLT_ROUNDS_, FPSCR content will be moved into FP register
and then GPR, and then truncated into word.
For subtargets without direct move support, it will store and then load.
The load address needs adjustment (+4) only on big-endian targets. This
patch fixes it on using generic opcodes on little-endian and subtargets
with direct-move.
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D91845
This:
1) Changes `reportWarning` to `reportUniqueWarning` (no-op here).
2) Adds more context to the message.
3) Merges `broken-dynsym-link.test` into `dyn-symbols.test`, adds more testing.
Differential revision: https://reviews.llvm.org/D92380
Commit 6b1341eb fixed alignment for 128-bit FP types on PowerPC.
However, the quadword alignment adjustment shouldn't be applied to IBM
extended double (ppc_fp128 in IR) values.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D92278
This reverts a side effect introduced in the code cleanup patch D43571:
LLD started to emit empty output sections that are explicitly assigned to a segment.
This patch fixes the issue by removing the !sec.phdrs.empty() special case from
isDiscardable. As compensation, we add an early phdrs propagation step (see the inline comment).
This is similar to one that we do in adjustSectionsAfterSorting.
Differential revision: https://reviews.llvm.org/D92301
Also, add notes about exporting ABI symbols.
Later, we can add notes about using git-clang-format before sending a patch for review.
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D92300
i1 is the native type for PowerPC if crbits is enabled. However, we need
to promote the i1 to i64 as we didn't have the pattern for i1.
Reviewed By: Qiu Chao Fang
Differential Revision: https://reviews.llvm.org/D92067
Also, for .o files, include full path as given on link command line.
Before:
lld: error: undefined symbol [...], referenced from sandbox_logging.o
After:
lld: error: undefined symbol [...], referenced from libseatbelt.a(sandbox_logging.o)
Move archiveName up to InputFile so we can consistently use toString()
to print InputFiles in diags, and pass it to the ObjFile ctor. This
matches the ELF and COFF ports.
Differential Revision: https://reviews.llvm.org/D92437
This adds missing `select` instruction support and block return type
support for reference types. Also refactors WebAssemblyInstrRef.td and
rearranges tests in reference-types.s. Tests don't include `exnref`
types, because we currently don't support `exnref` for `ref.null` and
the type will be removed soon anyway.
Reviewed By: tlively, sbc100, wingo
Differential Revision: https://reviews.llvm.org/D92359
The static_assert in "libcxx/include/memory" was the main offender here,
but then I figured I might as well `git grep -i instantat` and fix all
the instances I found. One was in user-facing HTML documentation;
the rest were in comments or tests.
I used a lot of `git grep` to find places where `std::` was being used
outside of comments and assert-messages. There were three outcomes:
- Qualified function calls, e.g. `std::move` becomes `_VSTD::move`.
This is the most common case.
- Typenames that don't need qualification, e.g. `std::allocator` becomes `allocator`.
Leaving these as `_VSTD::allocator` would also be fine, but I decided
that removing the qualification is more consistent with existing practice.
- Names that specifically need un-versioned `std::` qualification,
or that I wasn't sure about. For example, I didn't touch any code in
<atomic>, <math.h>, <new>, or any ext/ or experimental/ headers;
and I didn't touch any instances of `std::type_info`.
In some deduction guides, we were accidentally using `class Alloc = typename std::allocator<T>`,
despite `std::allocator<T>`'s type-ness not being template-dependent.
Because `std::allocator` is a qualified name, this did parse as we intended;
but what we meant was simply `class Alloc = allocator<T>`.
Differential Revision: https://reviews.llvm.org/D92250
Since we know exactly which identifiers we expect to find in `chrono`,
a using-directive seems like massive overkill. Remove the directives
and qualify the names as needed.
One subtle trick here: In two places I replaced `*__p` with `*__p.get()`.
The former is an unqualified call to `operator*` on a class type, which
triggers ADL and breaks the new test. The latter is a call to the
built-in `operator*` on pointers, which specifically
does NOT trigger ADL thanks to [over.match.oper]/1.
Differential Revision: https://reviews.llvm.org/D92243