Currently, -fdeclare-opencl-builtins always adds the generic address
space overloads of e.g. the vload builtin functions in OpenCL 3.0
mode, even when the generic address space feature is disabled.
Guard the generic address space overloads by the
`__opencl_c_generic_address_space` feature instead of by OpenCL
version.
Guard the private, global, and local overloads using the internal
`__opencl_c_named_address_space_builtins` feature.
Differential Revision: https://reviews.llvm.org/D107769
The current cost-model overestimates the cost of vector compares &
selects for ordered floating point compares. This patch fixes that by
extending the existing logic for integer predicates.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D118256
This reduces the dependencies of the MLIRVector target and makes the dialect consistent with other dialects.
Differential Revision: https://reviews.llvm.org/D118533
Do not warn on reserved identifiers resulting from expansion of system macros.
Also properly test -Wreserved-identifier wrt. system headers.
Should fix#49592
Differential Revision: https://reviews.llvm.org/D118532
Based on the output of include-what you-use.
Most notably, llvm/Remarks/Remark.h is no longer automatically included by
llvm/Remarks/RemarkParser.h, so client code may need to include explicitly.
clang++ -E -Iinclude -I../llvm/include ../llvm/lib/Remarks/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 770253
after: 759347
Related discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D118506
Based on the output of include-what-you-use.
It's an utility directory, so no much impact on other code areas.
clang++ -E -Iinclude -I../llvm/include ../llvm/utils/TableGen/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 4327274
after: 4316190
Related discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D118466
Display notes for a possible call chain if an unsafe function is found to be
called (maybe indirectly) from a signal handler.
The call chain displayed this way includes probably not the first calls of
the functions, but it is a valid possible (in non path-sensitive way) one.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D118224
This reverts commit 36892727e4 which
breaks the build when LLVM_INSTALL_TOOLCHAIN_ONLY is enabled with:
CMake Error at cmake/modules/AddLLVM.cmake:683 (add_dependencies):
The dependency target "clang-tidy-headers" of target "CTTestTidyModule"
does not exist.
The pruning cloner already tries to remove unreachable blocks. The
original cloning process will simplify instructions and constant
terminators, and only clone blocks that are reachable at that point.
However, phi nodes can only be simplified after everything has been
cloned. For that reason, additional blocks may become unreachable
after phi simplification.
The code does try to handle this as well, but only removes blocks
that don't have predecessors. It misses unreachable cycles. This
can cause issues if SEH exception handling code is part of an
unreachable cycle, as the inliner is not prepared to deal with that.
This patch instead performs an explicit scan for reachable blocks,
and drops everything else.
Fixes https://github.com/llvm/llvm-project/issues/53206.
Differential Revision: https://reviews.llvm.org/D118449
masked.atomicrmw.*.i32 intrinsics access an i32 (and then possibly
mask it), so hardcode MVT::i32 as the access type here, rather than
determining it from the pointer element type.
Differential Revision: https://reviews.llvm.org/D118336
Following Sanjay's proposal from discussion in D118317, this patch
generalizes and-reduce handling to fold the following pattern
```
icmp ne (bitcast(icmp ne (lhs, rhs)), 0)
```
into
```
icmp ne (bitcast(lhs), bitcast(rhs))
```
https://alive2.llvm.org/ce/z/WDcuJ_
Differential Revision: https://reviews.llvm.org/D118431
Reviewed By: lebedev.ri
Support affine.load/store ops in fold-memref-subview ops pass. The
existing pass just "inlines" the subview operation on load/stores by
inserting affine.apply ops in front of the memref load/store ops: this
is by design always consistent with the semantics on affine.load/store
ops and the same would work even more naturally/intuitively with the
latter.
Differential Revision: https://reviews.llvm.org/D118565
This is a slight change because I'm using the ANY_EXTEND result
instead of the original operand, but getNode should constant fold.
While there, add a comment about why the code specifically checks
for a ConstantSDNode.
Update SCF pass cmd line names to prefix `scf`. This is consistent with
guidelines/convention on how to name dialect passes. This also avoids
ambiguity on the context given the multiple `for` operations in the
tree.
NFC.
Differential Revision: https://reviews.llvm.org/D118564
Based on the existing naming "only" tests are used for rv32 instructions
that don't exist in rv64. rv32 tests without "only" are for instructions
that are in both rv32 and rv64. The rv64 tests are for instructions
that are only in rv64.
Both of these test files have instruction encodings that are only
valid in rv64 so they can be the same file.
"pack t0, t1, zero" disassembles to "pack t0, t1, zero" with Zbkb
not "zext.h t0, t1"
Part of the test was using a CHECK prefix that doesn't appear on
the RUN line.
Previously an InputSectionBase is dead (`partition==0`) by default.
SyntheticSection calls markLive and BssSection overrides that with markDead.
It is more natural to make InputSectionBase live by default and let
--gc-sections mark InputSectionBase dead.
When linking a Release build of clang:
* --no-gc-sections:, the removed `inputSections` loop decreases markLive time from 4ms to 1ms.
* --gc-sections: the extra `inputSections` loop increases markLive time from 0.181296s to 0.188526s.
This is as of we lose the removing one `inputSections` loop optimization (4374824ccf).
I believe the loss can be mitigated if we refactor markLive.
If AllocationOrder has less than 32 elements, we were treating the extra
positions as if they were valid. This was detected by a subsequent
assert. The fix also tightens the asserts.
Both IDFCalculatorBase and its accompanying DominatorTreeBase only supports pointer nodes. The template argument is the block type itself and any uses of GraphTraits is therefore done via a pointer to the node type.
However, the ChildrenGetterTy type of IDFCalculatorBase has a use on just the node type instead of a pointer to the node type. Various parts of the monorepo has worked around this issue by providing specializations of GraphTraits for the node type directly, or not been affected by using specializations instead of the generic case. These are unnecessary however and instead the generic code should be fixed instead.
An example from within Tree is eg. A use of IDFCalculatorBase in InstrRefBasedImpl.cpp. It basically instantiates a IDFCalculatorBase<MachineBasicBlock, false> but due to the bug above then goes on to specialize GraphTraits<MachineBasicBlock> although GraphTraits<MachineBasicBlock*> exists (and should be used instead).
Similar dead code exists in clang which defines redundant GraphTraits to work around this bug.
This patch fixes both the original issue and removes the dead code that was used to work around the issue.
Differential Revision: https://reviews.llvm.org/D118386
We can use the RISCVISD::GREV encoding that swaps the bits in
each byte. This allows it to use the existing computeKnownBits
support for RISCVISD::GREV.
ClangUserExpression.h is relying on the forward declaration of
ClangExpressionParser in ClangFunctionCaller.h. This patch moves the
forward declaration to ClangUserExpression.h.
Limit this to SSE41 - AVX1 targets to avoid UNPCKL(PSHUFB,PSHUFB), pre-SSE41 we don't have PACKUSDW/BLENDW and with AVX2 we can perform this as PERMQ(PSHUFB()).
`__builtin_c[tl]z` accepts `unsigned int` argument that is not always the
same as uint32_t. For example, `unsigned int` is uint16_t on MSP430.
Reviewed By: aykevl
Differential Revision: https://reviews.llvm.org/D86547