Use unique_ptr to achieve the effect of mutable.
Remove mutable keyword of DynRefCount and HoldRefCount
Remove std::shared_ptr from UpdateMtx
Reviewed By: tianshilei1992, grokos
Differential Revision: https://reviews.llvm.org/D109007
Per the contract of ReadLateParsedTemplates, we should not be returning
the same results multiple times. No functionality change intended, other
than to runtime.
Thanks to Luboš Luňák for identifying the cause of the regression!
There is a separate field `isPragmaOnce` and when `isImport` combines
both, it complicates HeaderFileInfo serialization as `#pragma once` is
the inherent property of the header while `isImport` reflects how other
headers use it. The usage of the header can be different in different
contexts, that's why `isImport` requires tracking separate from `#pragma once`.
Differential Revision: https://reviews.llvm.org/D104351
This is a case I'd missed in 6a8237. The odd bit here is that missing the edge removal update seems to produce MemorySSA which verifies, but is still corrupt in a way which bothers following passes. I wasn't able to reduce a single pass test case, which is why the reported test case is taken as is.
Differential Revision: https://reviews.llvm.org/D109068
(1) renamed SparseTensor to SparseTensorCOO, the other one remains SparseTensorStorage to focus on contrast
(2) documents difference between public API exclusively for compiler-generated code and methods that could be used by other runtimes (TBD) that want to interact with MLIR
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D109039
Add method to get NameLoc. Treat null child location as unknown to avoid
needing to create UnknownLoc in C API where child loc is not needed.
Differential Revision: https://reviews.llvm.org/D108678
I believe, the profitability reasoning here is correct
"sub"reg is already located within the 0'th subreg of wider reg,
so if we have suvector insertion at index 0 into undef,
then it's always free do to.
After this, D109065 finally avoids the regression in D108382.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D109074
Similar to D97585.
D25456 used `S_ATTR_LIVE_SUPPORT` to ensure the data variable will be retained
or discarded as a unit with the counter variable, so llvm.compiler.used is
sufficient. It allows ld to dead strip unneeded profc and profd variables.
Reviewed By: vsk
Differential Revision: https://reviews.llvm.org/D105445
This adds lowering of the llvm.fptosi.sat and llvm.fptoui.sat intinsics,
selecting a VCVT instruction which under MVE will inherently perform the
saturate.
Differential Revision: https://reviews.llvm.org/D107865
As started in D107925, this patch replaces the remaining occurrences
of `UNIFIED_SHARED_MEMORY && TgtPtrBegin == HstPtrBegin` in
`omptarget.cpp` with `IsHostPtr`. The former condition is broken in
the rare case that the device and host happen to use the same address
for their mapped allocations. I don't know how to write a test that's
likely to reveal this case.
Reviewed By: grokos
Differential Revision: https://reviews.llvm.org/D107928
As discussed in D105990, without this patch, `targetDataBegin`
determines whether to transfer data (as opposed to assuming it's in
shared memory) using the condition `!UseUSM || HasCloseModifier`.
However, this condition is broken if use of discrete memory was forced
by `omp_target_associate_ptr`. This patch extends
`unified_shared_memory/associate_ptr.c` to reveal this case, and it
fixes it using `!IsHostPtr` in `DeviceTy::getTargetPointer` to replace
this condition.
Reviewed By: grokos
Differential Revision: https://reviews.llvm.org/D107927
This patch is based on comments in D105990. It is NFC according to
the following observations:
1. `CopyMember` is computed as `!IsHostPtr && IsLast`.
2. `DelEntry` is true only if `IsLast` is true.
We apply those observations in order:
```
if ((DelEntry || Always || CopyMember) && !IsHostPtr)
if ((DelEntry || Always || IsLast) && !IsHostPtr)
if ((Always || IsLast) && !IsHostPtr)
```
Reviewed By: grokos
Differential Revision: https://reviews.llvm.org/D107926
As discussed in D105990, without this patch, `targetDataEnd`
determines whether to transfer data or delete a device mapping (as
opposed to assuming it's in shared memory) using two different
conditions, each of which is broken for some cases:
1. `!(UNIFIED_SHARED_MEMORY && TgtPtrBegin == HstPtrBegin)`: The
broken case is rare: the device and host might happen to use the
same address for their mapped allocations. I don't know how to
write a test that's likely to reveal this case, but this patch does
fix it, as discussed below.
2. `!UNIFIED_SHARED_MEMORY || HasCloseModifier`: There are at least
two broken cases:
1. The `close` modifier might have been specified on an `omp
target enter data` but not the corresponding `omp target exit
data`, which thus might falsely assume a mapping is in shared
memory. The test `unified_shared_memory/close_enter_exit.c`
already has a missing deletion as a result, and this patch adds
a check for that. This patch also adds the new test
`close_member.c` to reveal a missing transfer and deletion.
2. Use of discrete memory might have been forced by
`omp_target_associate_ptr`, as in the test
`unified_shared_memory/api.c`. In the current `targetDataEnd`
implementation, this condition turns out not be used for this
case: because the reference count is infinite, a transfer is
possible only with an `always` modifier, and this condition is
never used in that case. To ensure it's never used for that
case in the future, this patch adds the test
`unified_shared_memory/associate_ptr.c`.
Fortunately, `DeviceTy::getTgtPtrBegin` already has a solution: it
reports whether the allocation was found in shared memory via the
variable `IsHostPtr`.
After this patch, `HasCloseModifier` is no longer used in
`targetDataEnd`, and I wonder if the `close` modifier is ever useful
on an `omp target data end`.
Reviewed By: grokos
Differential Revision: https://reviews.llvm.org/D107925
There's a silent bug in our reasoning about zero strides. We assume that having a single static exit implies that if that exit is not taken, then the loop must be infinite. This ignores the potential for abnormal exits via exceptions. Consider the following example:
for (uint_8 i = 0; i < 1; i += 0) {
throw_on_thousandth_call();
}
Our reasoning is such that we'd conclude this loop can't take the backedge as that would lead to a (presumed) infinite loop.
In practice, this is a silent bug because the loopIsFiniteByAssumption returns false strictly more often than the loopHaNoAbnormalExits property. We could reasonable want to change that in the future, so fixing the codeflow now is worthwhile.
Differential Revision: https://reviews.llvm.org/D109029
Currently, the truncate selection dag node is expanded as a bitwise AND plus compare to 1. This change enables scalar comparison in the pattern if the truncate node is uniform.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D108925
As mentioned in D108833, the logic for figuring out if a backedge is dead was somewhat interwoven with the SCEV based logic and the symbolic eval logic. This is my attempt at making the code easier to follow.
Note that this is only NFC after the work done in 29fa37ec. Thanks to Nikita for catching that case.
Differential Revision: https://reviews.llvm.org/D108848
The commandline flag to specify a particular openmp devicertl library
currently errors like:
```
fatal error: cannot open file
'./runtimes/runtimes-bins/openmp/libomptarget':
Is a directory
```
CommonArgs successfully appends the directory to the commandline args then
mlink-builtin-bitcode rejects it.
This patch is a point fix to that. If --libomptarget-amdgcn-bc-path=directory
then append the expected name for the current architecture and go on as before.
This is useful for test runners that don't hardcode the architecture.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D109057
With opaque pointers, no actual bitcasts will be present. Instead,
there will be a mismatch between the call FunctionType and the
function ValueType. Change the code to collect CallBases
specifically (rather than general Uses) and compare these types.
RAUW is no longer performed, as there would no longer be any
bitcasts that can be RAUWd.
Differential Revision: https://reviews.llvm.org/D108880
This patch is to add Image Operands in SPIR-V Dialect and also let ImageDrefGather to use Image Operands.
Image Operands are used in many image instructions. "Image Operands encodes what oprands follow, as per Image Operands". And ususally, they are optional to image instructions.
The format of image operands looks like:
%0 = spv.ImageXXXX %1, ... %3 : f32 ["Bias|Lod"](%4, %5 : f32, f32) -> ...
This patch doesn’t implement all operands (see Section 3.14 in SPIR-V Spec) but provides a skeleton of it. There is TODO in verifyImageOperands function.
Co-authored: Alan Liu <alanliu.yf@gmail.com>
Reviewed by: antiagainst
Differential Revision: https://reviews.llvm.org/D108501
We'd special cased this logic to use pointer types for non-integral pointers, but there's no reason we can't do that for all pointer types. Doing it this was has a few advantages:
a) The code itself becomes more straight forward, and easier to test.
b) We avoid introducing ptrtoint into programs which didn't have them in the source.
c) The resulting codegen is easier to analyze and simplify (mostly due to lack of ptrtoint).
Note that there are some test diffs, but a) running them through instcombine helps a ton, and b) there's enough missing obvious transforms on both before and after IR that it's clear this isn't performance sensitive.
This is mostly motivated by cleaning up mentions of non-integrals to have a clearer idea of what we actually need to support.
Differential Revision: https://reviews.llvm.org/D104662
Document the CSR AGPRs for GFX90A.
Remove the TODO for gfx908, as the answer is that we don't mark any
AGPRs as callee-saved except for GFX90A, i.e. the docs as-is are correct
for gfx908.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D109009
Document 'use-external-name' and the various bits of logic that make it
work, to avoid others having to repeat the archival work (given that I
added getFileRefReturnsCorrectNameForDifferentStatPath to
FileManagerTest, seems possible I understood this once before!).
- b59cf679e8 added 'use-external-name' to
RedirectingFileSystem. This causes `stat`s to return the external
name for a redirected file instead of the name it was accessed by,
leaking it through the VFS.
- d066d4c849 propagated the external name
further through clang::FileManager.
- 4dc5573acc, which added
clang::FileEntryRef to clang::FileManager, has complicated concession
to account for this as well (since refactored a bit).
The goal of 'use-external-name' is to enable Clang to report "real" file
paths to users (via diagnostics) and to external tools (such as
debuggers reading debug info and build systems reading `.d` files).
I've added FIXMEs to look at other channels for communicating the
external names, since the current implementation adds complexity to
FileManager and exposes an inconsistent interface to clients.
Besides that, the FileManager logic appears to be kicking in outside of
'use-external-name'. Seems that *some* vfs::FileSystem implementations
canonicalize some paths returned by `stat` in *some* cases (the bug
isn't fully understood yet). Volodymyr Sapsai is investigating, this at
least better documents what *is* understood.
This patch adds a skeleton as a preparatory step for the next patch which
adds the actual implementations of the condition variable functions.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D108947