I was trying to pick this up a bit when reviewing D48426 (& perhaps D69778) - in any case, looks like D48426 added a module level flag that might not be needed.
The D48426 implementation worked by setting a module level flag, then code generating contents from the PCH a special case in ASTContext::DeclMustBeEmitted would be used to delay emitting the definition of these functions if they came from a Module with this flag.
This strategy is similar to the one initially implemented for modular codegen that was removed in D29901 in favor of the modular decls list and a bit on each decl to specify whether it's homed to a module.
One major difference between PCH object support and modular code generation, other than the specific list of decls that are homed, is the compilation model: MSVC PCH modules are built into the object file for some other source file (when compiling that source file /Yc is specified to say "this compilation is where the PCH is homed"), whereas modular code generation invokes a separate compilation for the PCH alone. So the current modular code generation test of to decide if a decl should be emitted "is the module where this decl is serialized the current main file" has to be extended (as Lubos did in D69778) to also test the command line flag -building-pch-with-obj.
Otherwise the whole thing is basically streamlined down to the modular code generation path.
This even offers one extra material improvement compared to the existing divergent implementation: Homed functions are not emitted into object files that use the pch. Instead at -O0 they are not emitted into the IR at all, and at -O1 they are emitted using available_externally (existing functionality implemented for modular code generation). The pch-codegen test has been updated to reflect this new behavior.
[If possible: I'd love it if we could not have the extra MSVC-style way of accessing dllexport-pch-homing, and just do it the modular codegen way, but I understand that it might be a limitation of existing build systems. @hans / @thakis: Do either of you know if it'd be practical to move to something more similar to .pcm handling, where the pch itself is passed to the compilation, rather than homed as a side effect of compiling some other source file?]
Reviewers: llunak, hans
Differential Revision: https://reviews.llvm.org/D83652
Summary:
When a constant array of empty strings goes through contant folding, the result
is something that contains no bytes. If this array is passed to the intrinsic
function `RESHAPE()`, we were not handling things correctly. I fixed this by
checking for an empty destination when calling the function `CopyFrom()` on an
array of strings.
I also added a test with a couple of different examples that trigger the
problem.
Reviewers: klausler, tskeith, DavidTruby
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D84352
We want to be sure that atomic<size_t> is always lock-free, or the code
will be much slower than expected (and could even conceivably fail if
the lock implementation somehow calls back into libc++abi).
This allows simplifying the implementation of barriers.
This is a re-commit of 1ac403bd14, which had to be reverted in
64a9c944fc because the minimum CMake version wasn't high enough.
Now that we've upgraded, we can do this.
Differential Revision: https://reviews.llvm.org/D75243
The implementation of the xvtlsbb builtins/intrinsics were not correct as the
intrinsics previously used i1 as an argument type. This patch changes the i1
argument type used in these intrinsics to be i32 instead, as having the second
as an i1 can lead to issues in the backend.
Differential Revision: https://reviews.llvm.org/D84291
After lots of follow-up fixes, there are still problems, such as
-Wno-suggest-override getting passed to the Windows Resource Compiler
because it was added with add_definitions in the CMake file.
Rather than piling on another fix, let's revert so this can be re-landed
when there's a proper fix.
This reverts commit 21c0b4c1e8.
This reverts commit 81d68ad27b.
This reverts commit a361aa5249.
This reverts commit fa42b7cf29.
This reverts commit 955f87f947.
This reverts commit 8b16e45f66.
This reverts commit 308a127a38.
This reverts commit 274b6b0c7a.
This reverts commit 1c7037a2a5.
We can just use the definition from config.h. This means we need to move
a few lines around in CMakeLists.txt - the TF_AOT detection needs to be
before the spot we process the config.h.cmake files.
Differential Revision: https://reviews.llvm.org/D84349
There's no reason to involve the hassle of a virtual method targets
have to override for a simple boolean.
Not sure exactly what's going on with Mips, but it seems to define its
own totally separate handler classes.
This implements OpenMP runtime support for the OpenMP TR8 `present`
map type modifier. The previous patch in this series implements Clang
front end support. See that patch summary for behaviors that are not
yet supported.
Reviewed By: grokos, jdoerfert
Differential Revision: https://reviews.llvm.org/D83062
The underlying infrastructure supports this already, just add the
pattern matching for linalg.generic.
Differential Revision: https://reviews.llvm.org/D84335
If we use the default of None, we get a python exception in
find_and_diagnose_missing() instead of printing a sensible error message.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D84342
This was structured in a way that implied every split argument is in
memory, or in registers. It is possible to pass an original argument
partially in registers, and partially in memory. Transpose the logic
here to only consider a single piece at a time. Every individual
CCValAssign should be treated independently, and any merge to original
value needs to be handled later.
This is in preparation for merging some preprocessing hacks in the
AMDGPU calling convention lowering into the generic code.
I'm also not sure what the correct behavior for memlocs where the
promoted size is larger than the original value. I've opted to clamp
the memory access size to not exceed the value register to avoid the
explicit trunc/extend/vector widen/vector extract instruction. This
happens for AMDGPU for i8 arguments that end up stack passed, which
are promoted to i16 (I think this is a preexisting DAG bug though, and
they should not really be promoted when in memory).
For now, xdrrec_create is only intercepted Linux as its signature
is different on Solaris.
The method of intercepting xdrrec_create isn't super ideal but I
couldn't think of a way around it: Using an AddrHashMap combined
with wrapping the userdata field.
We can't just allocate a handle on the heap in xdrrec_create and leave
it at that, since there'd be no way to free it later. This is because it
doesn't seem to be possible to access handle from the XDR struct, which
is the only argument to xdr_destroy.
On the other hand, the callbacks don't have a way to get at the
x_private field of XDR, which is what I chose for the HashMap key. So we
need to wrap the handle parameter of the callbacks. But we can't just
pass x_private as handle (as it hasn't been set yet). We can't put the
wrapper struct into the HashMap and pass its pointer as handle, as the
key we need (x_private again) hasn't been set yet.
So I allocate the wrapper struct on the heap, pass its pointer as
handle, and put it into the HashMap so xdr_destroy can find it later and
destroy it.
Differential Revision: https://reviews.llvm.org/D83358
Otherwise if 'ld' is an older system LLD (FreeBSD; or if someone adds 'ld' to
point to an LLD from a different installation) which does not support the
current ModuleSummaryIndex::BitCodeSummaryVersion, the test will fail.
Add lit feature 'binutils_lto'. GNU ld is more common than GNU gold, so
we can just require 'is_binutils_lto_supported' to additionally support GNU ld.
Reviewed By: myhsu
Differential Revision: https://reviews.llvm.org/D84133
Implementing new functionality tested in this file requires adding new
tests for many IR addressing patterns, which can be a large
maintenance burden. This patch makes adding tests easier by switching
to using autogenerated checks. This patch also removes the testing
mode that has simd128 disabled because it would produce very large
checks and is not particularly interesting.
Differential Revision: https://reviews.llvm.org/D84288
(This reverts commit a5e0194709, and
corrects author).
Rename the pass to be able to extend it to function properties other than inliner features.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D82044
A previous patch added -Wsuggest-override using a simple add_flag_if_supported(). This causes lots of warnings in LLVM when building with older GCC versions (< 9.2) which suggest adding override to functions that are only marked final. The current flags in both GCC >=9.2 and Clang accept plain final as equivalent to override final.
This patch adds logic to detect versions of -Wsuggest-override that warn on void foo() final and disables them to avoid warning spam in builds using older GCC's. This has the added minor benefit of getting rid of the useless C_SUPPORTS_SUGGEST_OVERRIDE_FLAG CMake cache variable which was set by add_flag_if_supported().
Differential Revision: https://reviews.llvm.org/D84292
- Remove the spurious argument to `CommandObjectScript`.
- Use make_shared instead of bare `new`.
- Move code duplication behind a macro.
Differential revision: https://reviews.llvm.org/D84336
These calls are neither intercepted by compiler-rt nor is libatomic.a
naturally instrumented.
This patch uses the existing libcall mechanism to detect a call
to atomic_load or atomic_store, and instruments them much like
the preexisting instrumentation for atomics.
Calls to _load are modified to have at least Acquire ordering, and
calls to _store at least Release ordering. Because this needs to be
converted at runtime, msan injects a LUT (implemented as a vector
with extractelement).
Differential Revision: https://reviews.llvm.org/D83337
Given a vecreduce.add(select(p, x, 0)), we can convert that to a
predicated vaddv, as the else value for the select is the identity
value, a zero. That is what this patch does for the vaddv, vaddva,
vaddlv and vaddlva instructions, copying the existing patterns to also
handle predication through a select.
Differential Revision: https://reviews.llvm.org/D84101
Summary:
This patch implements semantics for the 'arm_sve_vector_bits' type
attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1].
The purpose of this attribute is to define fixed-length (VLST) versions
of existing sizeless types (VLAT).
Implemented in this patch is the the behaviour described in section 3.7.3.2
and minimal parts of sections 3.7.3.3 and 3.7.3.4, this includes:
* Defining VLST globals, structs, unions, and local variables
* Implicit casting between VLAT <=> VLST.
* Diagnosis of ill-formed conditional expressions of the form:
C ? E1 : E2
where E1 is a VLAT type and E2 is a VLST, or vice-versa. This
avoids any ambiguity about the nature of the result type (i.e is
it sized or sizeless).
* For vectors:
* sizeof(VLST) == N/8
* alignof(VLST) == 16
* For predicates:
* sizeof(VLST) == N/64
* alignof(VLST) == 2
VLSTs have the same representation as VLATs in the AST but are wrapped
with a TypeAttribute. Scalable types are currently emitted in the IR for
uses such as globals and structs which don't support these types, this
is addressed in the next patch with codegen, where VLSTs are lowered to
sized arrays for globals, structs / unions and arrays.
Not implemented in this patch is the behaviour guarded by the feature
macros:
* __ARM_FEATURE_SVE_VECTOR_OPERATORS
* __ARM_FEATURE_SVE_PREDICATE_OPERATORS
As such, the GNU __attribute__((vector_size)) extension is not available
and operators such as binary '+' are not supported for VLSTs. Support
for this is intended to be addressed by later patches.
[1] https://developer.arm.com/documentation/100987/latest
This is patch 2/4 of a patch series.
Reviewers: sdesmalen, rsandifo-arm, efriedma, cameron.mcinally, ctetreau, rengolin, aaron.ballman
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D83551
Rename the pass to be able to extend it to function properties other than inliner features.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D82044
This patch has no effect for C and C++. In more dynamic languages,
such as Objective-C and Swift GetByteSize() needs to call into the
language runtime, so it's important to pass one in where possible. My
primary motivation for this is some work I'm doing on the Swift
branch, however, it looks like we are also seeing warnings in
Objective-C that this may resolve. Everything in the SymbolFile
hierarchy still passes in nullptrs, because we don't have an execution
context in SymbolFile, since SymbolFile transcends processes.
Differential Revision: https://reviews.llvm.org/D84267
Explain why you can only get a cached analysis result, not compute one
on the fly.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D84259