Sharing the FileManager between the importer and the module build should
only be an optimization. Add a cc1 option -fno-modules-share-filemanager
to allow us to test this. Fix the path to modulemap files, which
previously depended on the shared FileManager when using a path mapped to
an external file in a VFS.
Differential Revision: https://reviews.llvm.org/D131076
In 6a79e2ff19 we changed FileManager::getEntryRef() to return the
redirecting FileEntryRef instead of looking through the redirection.
This commit fixes the case when looking up a cached file path to also
return the redirecting FileEntryRef. This mainly affects the behaviour
of calling getNameAsRequested() on the resulting entry ref.
Differential Revision: https://reviews.llvm.org/D131273
This fixes a bug reported privately by @craig.topper. Here's an example which illustrates the problem:
vsetivli a1, a0, e32, m1, ta, mu # both DefInfo and PrevInfo
vsetivli a2, a1, e32, m4, ta, mu
With the unsound result being:
vsetivli a1, a0, e32, m1, ta, mu
vsetivli a2, a0, e32, m4, ta, mu
Consider the case where this is running on a machine with VLEN=512. For this case, the VLMAXs are 16 and 64 respectively.
Consider a0 = 33. The correct result is: a1 = 16 and a2 = 16
After the unsound optimization: a1 = 16 and a2 = 33
This particular example used VLMAXs which differed by more than a power of two. With a difference of only one power of two, there's another form of this bug which involves the AVL < 2 x VLMAX special case, but that one's more complicated to construct, as many examples turn out to be accidentally sound.
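The arithmetic in the example above can be checked with a small model. This is only a sketch: the helper names are made up for illustration, and it assumes the simplest permitted behaviour, vl = min(AVL, VLMAX), ignoring the extra freedom the spec gives implementations when AVL < 2 x VLMAX.

```python
def vlmax(vlen_bits: int, sew_bits: int, lmul: int) -> int:
    """VLMAX = VLEN * LMUL / SEW for integer LMUL."""
    return vlen_bits * lmul // sew_bits


def vsetvli(avl: int, vlen_bits: int, sew_bits: int, lmul: int) -> int:
    """Simplified model (assumption): the returned vl is min(AVL, VLMAX)."""
    return min(avl, vlmax(vlen_bits, sew_bits, lmul))


VLEN = 512
a0 = 33

# Original sequence: the second instruction takes its AVL from a1.
a1 = vsetvli(a0, VLEN, 32, 1)
a2_correct = vsetvli(a1, VLEN, 32, 4)

# Unsound rewrite: the second instruction takes its AVL from a0 directly.
a2_unsound = vsetvli(a0, VLEN, 32, 4)

print(a1, a2_correct, a2_unsound)  # 16 16 33
```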
This patch takes the approach of simply removing the unsound optimization, but there are multiple sound sub-cases of it. I plan to return to at least a couple of them, but figured it was cleaner to remove the unsound optimization (for ease of backporting), and then review the new optimizations on their own.
Differential Revision: https://reviews.llvm.org/D131264
Create function segments and emit unwind info for them.
A segment must be less than 1MB, and no prolog or epilog may be split across two
segments.
This patch should generate correct, though not optimal, unwind info for large
functions. Currently it only generates packed info (.pdata) for functions
that are less than 1MB (single-segment functions). This is unchanged from before
this patch.
The next step is to enable (.pdata)-only unwind info for the first segment or for
segments that have neither a prolog nor an epilog in a multi-segment function.
Another future work item is to further split segments that require more than 255
code words or have more than 65535 epilogs.
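The two segment constraints described above (each segment under 1MB, no prolog or epilog split across segments) can be illustrated with a greedy split. This is a hypothetical sketch, not the algorithm used by the patch: the function name, the byte-range representation, and the 4-byte instruction size are assumptions made for the example.

```python
MAX_SEGMENT_BYTES = 1 << 20  # a segment must be strictly smaller than 1MB
INSTRUCTION_BYTES = 4        # assumption: fixed-width 4-byte instructions


def split_into_segments(function_size, protected_ranges):
    """Split [0, function_size) into segments smaller than 1MB.

    protected_ranges is a list of half-open (start, end) byte ranges
    (prologs and epilogs) that must stay inside a single segment.
    """
    segments = []
    start = 0
    while function_size - start >= MAX_SEGMENT_BYTES:
        # Largest instruction-aligned cut that keeps the segment under 1MB.
        cut = start + MAX_SEGMENT_BYTES - INSTRUCTION_BYTES
        for low, high in protected_ranges:
            if low < cut < high:
                cut = low  # move the cut in front of the prolog/epilog
                break
        if cut <= start:
            raise ValueError("a protected range does not fit in one segment")
        segments.append((start, cut))
        start = cut
    segments.append((start, function_size))
    return segments
```

For a 2.5MB function with an epilog straddling the 1MB mark, the first cut moves back to the start of that epilog, and every resulting segment stays under the limit.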
Reference:
https://docs.microsoft.com/en-us/cpp/build/arm64-exception-handling#function-fragments
Differential Revision: https://reviews.llvm.org/D130049
This commit addresses concerns raised in D129497.
Propagate lowering options from the driver to expression lowering
via the AbstractConverter instance. The only use case so far is
using optimized TRANSPOSE lowering with O1/O2/O3.
bbc does not support optimization level switches, so it uses
default LoweringOptions (e.g. optimized TRANSPOSE lowering
is enabled by default, but an engineering -opt-transpose=false
option can still override this).
Differential Revision: https://reviews.llvm.org/D130204
Jason noted that the stop message we print for a memory high water mark
notification (EXC_RESOURCE) could be clearer. Currently, the stop
reason looks like this:
* thread #3, queue = 'com.apple.CFNetwork.LoaderQ', stop reason =
EXC_RESOURCE RESOURCE_TYPE_MEMORY (limit=14 MB, unused=0x0)
It's hard to read the message because the exception and the type
(EXC_RESOURCE RESOURCE_TYPE_MEMORY) blend together. Additionally, the
"unused=0x0" should not be printed for memory limit exceptions.
I wanted to continue to include the resource type from
<kern/exc_resource.h> while also explaining what it actually is. I used
the wording from the comments in the header. With this patch, the stop
reason now looks like this:
* thread #5, stop reason = EXC_RESOURCE (RESOURCE_TYPE_MEMORY: high
watermark memory limit exceeded) (limit=14 MB)
rdar://40466897
Differential revision: https://reviews.llvm.org/D131130
This design is borrowed from the lldb folks (thank you!) to declutter
the page.
* The version number at the top is removed.
* Links are pushed over to a sidebar.
* The sidebar has headings
There are other minor changes:
* The warning about this project not being ready is now an RST "warning"
* Links to the Bug Reports and the Source Code are added
* Refer to this project as either "The LLVM C Library" or "The libc"
Tested:
Built locally
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D131242
This commit combines the initial commit (7c240de609af), a fix for x86_64 Linux
(3a0581501e76) and a fix for a thinko in a last-minute rewrite that I really
should have run the testsuite on.
Also, make sure that all the "I need to step over watchpoint" plans execute
before we call a public stop. Otherwise, e.g. if you have N watchpoints and
a Signal, the signal stop info will get us to stop with the watchpoints in a
half-done state.
Differential Revision: https://reviews.llvm.org/D130674
MemRef types can now carry an attribute to represent the memory
space. Still, upper layers in the compilation stack mostly use
numeric values. They don't mean much (other than differentiating
separate memory domains) in MLIR's multi-level setting. Those
numeric memory spaces inside MemRef types need to be translated
into concrete SPIR-V storage classes during lowering to pin them
down to concrete memory types.
Thus far we have been hardcoding an arbitrary mapping from memory
space to storage class for converting MemRef types. This works fine
for only targeting Vulkan; it falls apart if we want to target other
SPIR-V consumers like OpenCL, as different consumers might want
different storage classes for the buffer/variable of the same
lifetime. For example, StorageBuffer in Vulkan vs. CrossWorkgroup
in OpenCL.
This commit therefore adds a new pass that lets the user control how
MemRef memory spaces are mapped to SPIR-V storage classes. This provides
more flexibility and can address the awkwardness in the current
SPIR-V type converter. This pass should be the preliminary step
towards lowering MemRef related types/ops into SPIR-V.
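As a rough illustration of why the mapping needs to be user-controlled, the sketch below maps the same numeric memory space to different storage classes depending on the target. The numeric values, table names, and function name are hypothetical and are not the defaults used by the pass; only the storage class names are real SPIR-V storage classes.

```python
# Hypothetical per-target tables: numeric memory space -> SPIR-V storage class.
VULKAN_MAPPING = {0: "StorageBuffer", 3: "Workgroup"}
OPENCL_MAPPING = {0: "CrossWorkgroup", 3: "Workgroup"}


def map_memory_space(memory_space: int, mapping: dict) -> str:
    """Translate a numeric memory space with a user-supplied mapping."""
    try:
        return mapping[memory_space]
    except KeyError:
        raise ValueError(f"no storage class for memory space {memory_space}")


# The same memory space lowers differently per consumer.
print(map_memory_space(0, VULKAN_MAPPING))  # StorageBuffer
print(map_memory_space(0, OPENCL_MAPPING))  # CrossWorkgroup
```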
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D130317
NOTE: i8 vector splats are ignored because the immediate range of
DUP already has full coverage.
Differential Revision: https://reviews.llvm.org/D131078
Correct a bug in the code that resets shadow memory introduced as part
of a previous change for the Go race detector (D128909). The bug was
that only the most recently added shadow segment was being reset, as
opposed to the entire extent of the segment created so far. This
fixes a bug identified in Google internal testing (b/240733951).
Differential Revision: https://reviews.llvm.org/D131256
We get a couple of improvements from recognizing swapped
operand patterns that were not handled by the replicated
code.
This should also enable simplifying larger patterns as
seen in issue #56653 and issue #56654, but that requires
enhancements to isImpliedCondition() itself.
As mentioned in https://reviews.llvm.org/D121379#3690593, this
change broke the build of compiler-rt targeting powerpc using GCC.
The 32-bit powerpc target is not supposed to emit 128-bit libcalls
-- if it does, then that's a backend bug and needs to be fixed there.
This reverts commit 8f24a56a3a.
Differential Revision: https://reviews.llvm.org/D130988
There are no AMDGPUSampleVariant versions for _G16; it is treated more like a
modifier for derivatives (_D) (also for intrinsics, where it is an overloaded type
instead of part of the intrinsic name), so we ended up making more variants for
these instructions than we actually needed.
32-bit derivatives need 6 dwords at most, while 16-bit need 4 at most. Using
the same AMDGPUSampleVariant for both, we ended up creating 2 more variants per
instruction than were necessary.
In total this deletes 260 unused tablegen records.
Differential Revision: https://reviews.llvm.org/D131252
Given a poison constant as input, the dyn_cast to a ConstantInt would
fail so we would fall through to the generic code that attempts to fold
each element of the input vectors. The inputs to these intrinsics are
not vectors though, leading to a compile time crash. Instead bail out
properly for poison values by returning nullptr. This doesn't try to
define what poison means for these intrinsics.
Fixes #56945
This fixes 69 llvm tests that failed when EXPENSIVE_CHECKS was enabled.
llvm/test/Transforms/IROutliner/outlining-commutative-operands-opposite-order.ll
is one example.
When we have EXPENSIVE_CHECKS, _GLIBCXX_DEBUG is defined. This means
that libstdc++ will call the compare function to check if it is
implemented correctly (that !(a < a) is true).
This happens even if there is only one item, and here we expect
to see either a single ret void or multiple returns of constant integers.
Don't sort if we have 1 item, but do assert that it is the 1
ret void we expect. In the comparator, assert that neither
Value is a nullptr in case one ended up in the list somehow.
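The failure mode can be modelled outside of C++. The sketch below is an analogy, not the IROutliner code: `debug_sort` imitates a debug-mode sort that checks !(a < a) for every element, `None` stands in for the missing return value of a ret void, and all names are hypothetical.

```python
import functools


def debug_sort(items, less):
    """Sort with a debug-style check that the comparator is irreflexive."""
    for item in items:
        # This check runs even when there is only a single item.
        assert not less(item, item), "comparator is not irreflexive"

    def compare(a, b):
        return -1 if less(a, b) else (1 if less(b, a) else 0)

    return sorted(items, key=functools.cmp_to_key(compare))


def less_by_constant(a, b):
    """Comparator that only knows how to compare constant integers."""
    if a is None or b is None:
        raise TypeError("comparator expects constant integers")
    return a < b


def sort_return_values(values):
    """Skip sorting for a single item, so the comparator is never called."""
    if len(values) == 1:
        return list(values)
    return debug_sort(values, less_by_constant)
```

Calling `debug_sort([None], less_by_constant)` fails inside the comparator, while `sort_return_values([None])` returns the single item untouched.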
Reviewed By: AndrewLitteken
Differential Revision: https://reviews.llvm.org/D130230
In the top-level llvm `CMakeLists.txt`, we need to call
`find_package(Python3)` *before* including `config-ix.cmake`, otherwise
the latter will not be able to successfully search for python modules
using `find_python_module()`. Also set `LLVM_MINIMUM_PYTHON_VERSION`
before calling `find_package(Python3)`, moving it to `CMakeLists.txt`
from `HandleLLVMOptions.cmake`.
Reviewed By: compnerd
Differential Revision: https://reviews.llvm.org/D131191
Support for functions wmempcpy, wmemmove, wmemcmp is added to the checker.
The same tests are copied that exist for the non-wide versions, with
non-wide functions and character types changed to the wide version.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D130470
In 6e566bc552, the directory structure of the documentation for clang-tidy checks was changed, but clangd wasn't updated.
As a result, all the generated links point to old, dead pages.
This updates clangd to use the new page structure.
Reviewed By: sammccall, kadircet
Differential Revision: https://reviews.llvm.org/D128379
According to the LoongArch ABI documentation
(https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html),
the calling convention of LoongArch is almost the same as RISCV's
(except for the vector part), so we borrow the implementation of RISCV.
This patch only guarantees the correctness of lp64d, because only the
part of lp64d is described in detail in the documentation.
Differential Revision: https://reviews.llvm.org/D130249
Closing https://github.com/llvm/llvm-project/issues/56919
It is meaningless to preserve the lifetime markers for the spilled
allocas in the coroutine frames and it would block some optimizations
too.
This is a simple addition to emitConditionalComparison, to match CCMP
with immediates using getIConstantVRegValWithLookThrough, letting it
select the CCMPri variants of the instructions.
Differential Revision: https://reviews.llvm.org/D131073
Consider:
A == 5 && A != 5
If A is 5, the old collectConjunctionTerms() would call itself again for
the LHS (which it ignores), then the RHS (which it also ignores) and
then just return without ever adding anything to the Terms array.
Differential Revision: https://reviews.llvm.org/D131070
This makes sure the assertions also get verified in optimized builds.
This matches what is already done in bad_unwind_info.pass.cpp.
Reviewed By: #libunwind, MaskRay
Differential Revision: https://reviews.llvm.org/D131210
This class has only the minimum functionality in it to provide what the
TZ variable parsing needs. In particular, the standard makes guarantees
about how trivial the destructors are, throws an exception if it's used
incorrectly, etc. There are also missing features.
Tested:
Trivial testsuite added, and used in development.
Reviewed By: gchatelet
Differential Revision: https://reviews.llvm.org/D129920
This patch adds a constant folder for Atan2Op, which only supports single- and double-precision floating-point.
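The idea of such a folder can be sketched as follows. This is an illustrative model, not the MLIR implementation: a non-constant operand is represented as `None`, the function name is hypothetical, and the (y, x) operand order is an assumption that follows the usual atan2 convention.

```python
import math
from typing import Optional


def fold_atan2(y: Optional[float], x: Optional[float]) -> Optional[float]:
    """Return the folded constant, or None when folding is not possible."""
    if y is None or x is None:
        # At least one operand is not a known constant: do not fold.
        return None
    return math.atan2(y, x)


print(fold_atan2(0.0, 1.0))   # 0.0
print(fold_atan2(None, 1.0))  # None
```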
Differential Revision: https://reviews.llvm.org/D131050
The BMI instruction `tzcnt` has better performance than `bsf` on new
processors. Its encoding differs from that of `bsf` only by a mandatory
'0xf3' prefix. If we force-emit the `rep` prefix for `bsf`, we will gain better
performance when the same code runs on new processors.
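A small model of the encodings and semantics shows why this is safe. It is a sketch for illustration only: the helper names are made up, only the register-to-register 32-bit forms are covered, and the undefined `bsf` destination for a zero source is modelled as `None`.

```python
def bsf_encoding(modrm: int) -> bytes:
    """bsf r32, r/m32 is encoded as 0F BC /r."""
    return bytes([0x0F, 0xBC, modrm])


def rep_bsf_encoding(modrm: int) -> bytes:
    """Adding the rep (F3) prefix yields F3 0F BC /r, the tzcnt encoding."""
    return bytes([0xF3]) + bsf_encoding(modrm)


def tzcnt32(value: int) -> int:
    """tzcnt: number of trailing zero bits; 32 when the source is zero."""
    value &= 0xFFFFFFFF
    if value == 0:
        return 32
    return (value & -value).bit_length() - 1


def bsf32(value: int):
    """bsf: index of the lowest set bit; destination undefined for zero."""
    value &= 0xFFFFFFFF
    if value == 0:
        return None
    return (value & -value).bit_length() - 1


# ModRM 0xC1 selects eax as destination and ecx as source.
print(rep_bsf_encoding(0xC1).hex())  # f30fbcc1
```

For every non-zero input the two instructions agree, so older processors that ignore the prefix still compute the same result.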
GCC already does it this way: https://c.godbolt.org/z/6xere6fs1
Fixes #34191
Reviewed By: craig.topper, skan
Differential Revision: https://reviews.llvm.org/D130956
Add a new "kernel" section with the following schema.
```
"kernel": {
"loadAddress"?: decimal | hex string | string decimal
# This is optional. If it's not specified, use default address 0xffffffff81000000.
"file": string
# path to the kernel image
}
```
More details of the diff:
- If the "kernel" section exists, the current tracing mode is //KernelMode//.
- If the tracing mode is //KernelMode//, the "processes" section must be empty and the "kernel" and "cpus" sections must be provided. This is tested with `TestTraceLoad`.
- The "kernel" section is parsed and turned into a new process with a single module, which is the kernel image. The kernel process has N fake threads, one for each cpu.
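The rules above can be sketched as a small validator. This is a hypothetical model of the described behaviour, not the lldb parser: the function names, the returned dictionary, and its "fakeThreadCount" key are assumptions made for the example.

```python
DEFAULT_KERNEL_LOAD_ADDRESS = 0xFFFFFFFF81000000


def parse_load_address(value):
    """Accept a decimal number, a hex string, or a decimal string."""
    if value is None:
        return DEFAULT_KERNEL_LOAD_ADDRESS
    if isinstance(value, bool):
        raise TypeError("loadAddress must be a number or a string")
    if isinstance(value, int):
        return value
    if isinstance(value, str):
        text = value.strip()
        base = 16 if text.lower().startswith("0x") else 10
        return int(text, base)
    raise TypeError("loadAddress must be a number or a string")


def parse_kernel_section(trace):
    """Return a description of the kernel process, or None if not KernelMode."""
    kernel = trace.get("kernel")
    if kernel is None:
        return None  # no "kernel" section: not KernelMode
    if trace.get("processes"):
        raise ValueError('"processes" must be empty in KernelMode')
    if not trace.get("cpus"):
        raise ValueError('"cpus" must be provided in KernelMode')
    return {
        "file": kernel["file"],
        "loadAddress": parse_load_address(kernel.get("loadAddress")),
        "fakeThreadCount": len(trace["cpus"]),  # one fake thread per cpu
    }
```

With two entries in "cpus" and no "loadAddress", the result uses the default address and reports two fake threads.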
Reviewed By: wallace
Differential Revision: https://reviews.llvm.org/D130805
In D130807 we added the `skipprofile` attribute. This commit
changes the format so we can either `forbid` or `skip` profiling
functions by adding the `noprofile` or `skipprofile` attributes,
respectively. The behavior of the original format remains
unchanged.
Also, add the `skipprofile` attribute when using
`-fprofile-function-groups`.
This was originally landed as https://reviews.llvm.org/D130808 but was
reverted due to a Windows test failure.
Differential Revision: https://reviews.llvm.org/D131195