The archive library truncated the size of archive members whose size was
greater than max uint32_t. This patch fixes the issue and adds some unit
tests to verify.
Reviewed by: ruiu, MaskRay, grimar, rupprecht
Differential Revision: https://reviews.llvm.org/D75742
Refines the gather/scatter cost model, but also changes the TTI
function getIntrinsicInstrCost to accept an additional parameter
which is needed for the gather/scatter cost evaluation.
This did require trivial changes in some non-ARM backends to
adopt the new parameter.
Extending gathers and truncating scatters are now priced cheaper.
Differential Revision: https://reviews.llvm.org/D75525
Summary:
Instead of generating two i32 instructions for each load or store of a volatile
i64 value (two LDRs or STRs), now emit LDRD/STRD.
These improvements cover architectures implementing ARMv5TE or Thumb-2.
The code generation explicitly deviates from using the register-offset
variant of LDRD/STRD. In this variant, the register allocated to the
register-offset cannot be reused in any of the remaining operands. Such
restriction seems to be non-trivial to implement in LLVM, thus it is
left as a to-do.
Reviewers: dmgreen, efriedma, john.brawn, nickdesaulniers
Reviewed By: efriedma, nickdesaulniers
Subscribers: danielkiss, alanphipps, hans, nathanchance, nickdesaulniers, vvereschaka, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70072
Behavior of IEEEFloat::roundToIntegral is aligned with IEEE-754
operation roundToIntegralExact. In partucular this function now:
- returns opInvalid for signaling NaNs,
- returns opInexact if the result of rounding differs from argument.
Differential Revision: https://reviews.llvm.org/D75246
If the loaded memory size was smaller than the result size, this would
produce out of bounds memory accesses. I'm wondering if we need a
distinct narrow memory legalize action type, since a case I care about
is decomposing a 4-byte unaligned access into 4 extending loads, which
would leave the original result register type. I'm currently awkwardly
using narrowScalar to handle unaligned accesses that need to be split.
We were seeing non-deterministic binary size differences depending on which
toolchain was used to build fuchsia. This is because libunwind embeded the
FILE path into a logging macro, even for release builds, which makes the code
dependent on the build directory.
This removes the file and line number from the error message. This is
consistent with how other runtimes report error, e.g.
https://github.com/llvm/llvm-project/blob/master/libcxxabi/src/abort_message.cpp#L30.
Differential Revision: https://reviews.llvm.org/D75890
Summary:
Enables JIT-linking by RuntimeDyld of COFF objects that contain references to
dllimport symbols. This is done by recognizing symbols that start with the
reserved "__imp_" prefix and building a pointer entry to the target symbol in
the stubs area of the section. References to the "__imp_" symbol are updated to
point to this pointer.
Work in progress: The generic code is in place, but only RuntimeDyldCOFFX86_64
and RuntimeDyldCOFFI386 have been updated to look for and update references to
dllimport symbols.
Reviewers: compnerd
Subscribers: hiraditya, ributzka, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75884
This patch allows rtdyld-check / jitlink-check expressions to be extended over
multiple lines by terminating each line with a '\'. E.g.
# llvm-rtdyld: *{8}X = \
# llvm-rtdyld: Y
X:
.quad Y
This will be used to break up some long lines in upcoming test cases.
Summary:
1-bit integer is tricky in different dialects sometimes. E.g., there is no
arithmetic instructions on 1-bit integer in SPIR-V, i.e., `spv.IMul %0, %1 : i1`
is not valid. Instead, `spv.LogicalAnd %0, %1 : i1` is valid. Creating the op
directly makes lowering easier because we don't need to match a complicated
pattern like `!(!lhs && !rhs)`. Also, this matches the semantic better.
Also add assertions on inputs.
Differential Revision: https://reviews.llvm.org/D75764
Summary:
The return type of __cxa_finalize is documented as void in the Itanium
C++ ABI, and it is void in various C libraries.
Reviewers: EricWF, ldionne, compnerd, mclow.lists, MaskRay
Reviewed By: MaskRay
Subscribers: MaskRay, dexonsmith, ldionne, libcxx-commits
Tags: #libc
Differential Revision: https://reviews.llvm.org/D75795
Due to Clang bug http://llvm.org/PR45151, deprecated attributes are not
picked up on partial specializations. This patch instead applies it to
the first declaration of std::function itself.
a dependent context.
This matches the GCC behavior.
We track the enclosing template depth when determining whether a
statement expression is within a dependent context; there doesn't appear
to be any other reliable way to determine this.
We previously assumed they were neither value- nor
instantiation-dependent under any circumstances, which would lead to
crashes and other misbehavior.
This avoids regressions in a future patch. I'm confused by the use of
the gfx9 usage legacy_mad. Was this a pointless instruction rename, or
uses fmul_legacy handling? Why is regular mac avilable in that case?
We would assign the incorrect DeclContext when transforming the RequiresExprBodyDecl, causing incorrect
handling of 'this' inside RequiresExprBodyDecls (bug #45162).
Assign the current context as the DeclContext of the transformed decl.
Fix a bug in IRGen where it wasn't destructing compound literals in C
that are ObjC pointer arrays or non-trivial structs. Also diagnose jumps
that enter or exit the lifetime of the compound literals.
rdar://problem/51867864
Differential Revision: https://reviews.llvm.org/D64464
Summary:
Outputs from an asm goto block cannot be used on the indirect branch.
It's not supported and may result in invalid code generation.
Reviewers: jyknight, nickdesaulniers, hfinkel
Reviewed By: nickdesaulniers
Subscribers: martong, cfe-commits, rnk, craig.topper, hiraditya, rsmith
Tags: #clang
Differential Revision: https://reviews.llvm.org/D71314
Summary:
These functions only use the FormatStyle to obtain a LangOptions via
format::getFormattingLangOpts(), and some callers can more easily obtain
a LangOptions more directly.
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75716
Summary:
Created a general check for restrict-system-includes under portability as recommend in the comments under D75332. I also fleshed out the user facing documentation to show examples for common use-cases such as allow-list, block-list, and wild carding.
Removed fuchsia's check as per phosek sugguestion.
Reviewers: aaron.ballman, phosek, alexfh, hokein, njames93
Reviewed By: phosek
Subscribers: Eugene.Zelenko, mgorny, xazax.hun, phosek, cfe-commits, MaskRay
Tags: #clang-tools-extra, #clang
Differential Revision: https://reviews.llvm.org/D75786
Summary:
This patch add some builtin operation for the gpu.all_reduce ops.
- for Integer only: `and`, `or`, `xor`
- for Float and Integer: `min`, `max`
This is useful for higher level dialect like OpenACC or OpenMP that can lower to the GPU dialect.
Differential Revision: https://reviews.llvm.org/D75766
isSameEntity was missing constraints checking, causing constrained overloads
to not travel well accross serialization. (bug #45115)
Add constraints checking to isSameEntity.
* Adds GpuLaunchFuncToVulkanLaunchFunc conversion pass.
* Moves a serialization of the `spirv::Module` from LaunchFuncToVulkanCalls pass to newly created pass.
* Updates LaunchFuncToVulkanCalls instrumentation pass, adds `initVulkan` and `deinitVulkan` runtime calls.
* Adds `bindResource` call to bind specifc resource by the given descriptor set and descriptor binding.
* Eliminates static construction and desctruction of `VulkanRuntimeManager`.
Differential Revision: https://reviews.llvm.org/D75192
Summary:
Interfaces/ is the designated directory for these types of interfaces, and also removes the need for including them directly in IR/.
Differential Revision: https://reviews.llvm.org/D75886
The interfaces themselves aren't really analyses, they may be used by analyses though. Having them in Analysis can also create cyclic dependencies if an analysis depends on a specific dialect, that also provides one of the interfaces.
Differential Revision: https://reviews.llvm.org/D75867
Summary:
As far as I can tell on gfx10 conversions to/from f32 (that are not
converting f32 to/from f64) are full rate instructions, but they were
marked as quarter rate instructions.
I have fixed this for gfx10 only. I assume the scheduling model was
correct for older architectures, though I don't have any documentation
handy to confirm that.
Reviewers: rampitec, arsenm
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75392
This revision takes advantage of the empty AffineMap to specify the
0-D edge case. This allows removing a bunch of annoying corner cases
that ended up impacting users of Linalg.
Differential Revision: https://reviews.llvm.org/D75831
This essentially reverts some of the SimplifyLibcalls part changes of D45736 [SimplifyLibcalls] Replace locked IO with unlocked IO.
C11 7.21.5.2 The fflush function
> If stream is a null pointer, the fflush function performs this flushing action on all streams for which the behavior is defined above.
i.e. fopen'ed FILE* is inherently captured.
POSIX.1-2017 getc_unlocked, getchar_unlocked, putc_unlocked, putchar_unlocked - stdio with explicit client locking
> These functions can safely be used in a multi-threaded program if and only if they are called while the invoking thread owns the ( FILE *) object, as is the case after a successful call to the flockfile() or ftrylockfile() functions.
After a thread fopen'ed a FILE*, when it is calling foobar() which is now replaced by foobar_unlocked(),
if another thread is concurrently calling fflush(0), the behavior is undefined.
C11 7.22.4.4 The exit function
> Next, all open streams with unwritten buffered data are flushed, all open streams are closed, and all files created by the tmpfile function are removed.
The replacement is only feasible if the program is single threaded, or exit or fflush(0) is never called.
See also http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180528/556615.html
for how the replacement makes libc interceptors difficult to implement.
dalias: in a worst case, it's unbounded data corruption because of concurrent access to pointers
without synchronization. f->wpos or rpos could get outside of the buffer, thread A could do
f->wpos += j after knowing j is in bounds, while thread B also changes it concurrently.
This can produce exploitable conditions depending on libc internals.
Revert the SimplifyLibcalls part change because the cons obviously
overweigh the pros. Even when the replacement is feasible, the benefit
is indemonstrable, more so in an application instead of an artificial
glibc benchmark. Theoretically the replacement could be beneficial when
calling getc_unlocked/putc_unlocked in a loop, but then it is better
using a blocked IO operation and the user is likely aware of that.
The function attribute inference is still useful and thus kept.
Reviewed By: xbolva00
Differential Revision: https://reviews.llvm.org/D75933