Remove uses of to-be-deprecated API. I've fallen back to calling
getPointerElementType() in some cases where the correct type wasn't
immediately obvious to me.
At most these use the StringRef/Twine wrappers and don't have any implicit uses of std::string.
Move the include down to any cpp implementation where std::string is actually used.
Use the elementtype attribute introduced in D105407 for the
llvm.preserve.array/struct.index intrinsics. It carries the
element type of the GEP these intrinsics effectively encode.
This patch:
* Adds a verifier check that the attribute is required.
* Adds it in the IRBuilder methods for these intrinsics.
* Autoupgrades old bitcode without the attribute.
* Updates the lowering code to use the attribute rather than
the pointer element type.
* Updates lots of tests to specify the attribute.
* Adds -force-opaque-pointers to the intrinsic-array.ll test
to demonstrate they work now.
https://reviews.llvm.org/D106184
We assume VLENB is a multiple of 8 and previously relied on shift
pairs being optimized to an AND+SHL/SHR and computeKnownBits
removing the AND. This doesn't happen if (vlenb >> 3) gets CSEd
to have multiple uses. This patch manually emits the best shift
to workaround this.
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D102107
This simplifies the vector to LLVM lowering. Previously, both vector.load/store and vector.transfer_read/write lowered directly to LLVM. With this commit, there is a single path to LLVM vector load/store instructions and vector.transfer_read/write ops must first be lowered to vector.load/store ops.
* Remove vector.transfer_read/write to LLVM lowering.
* Allow non-unit memref strides on all but the most minor dimension for vector.load/store ops.
* Add maxTransferRank option to populateVectorTransferLoweringPatterns.
* vector.transfer_reads with changing element type can no longer be lowered to LLVM. (This functionality is needed only for SPIRV.)
Differential Revision: https://reviews.llvm.org/D106118
Iterative-BFI produces better count quality and performance when evaluated on internal benchmarks. Turning it on by default now for CSSPGO. We can consider turn it on by default for AutoFDO as well in the future.
Differential Revision: https://reviews.llvm.org/D106202
ptrauth stores info in the address of functions, so it's not the right address we should check if poisoned
rdar://75246928
Differential Revision: https://reviews.llvm.org/D106199
D104422 added the interface for TraceCursor, which is the main way to traverse instructions in a trace. This diff implements the corresponding cursor class for Intel PT and deletes the now obsolete code.
Besides that, the logic for the "thread trace dump instructions" was adapted to use this cursor (pretty much I ended up moving code from Trace.cpp to TraceCursor.cpp). The command by default traverses the instructions backwards, and if the user passes --forwards, then it's not forwards. More information about that is in the Options.td file.
Regarding the Intel PT cursor. All Intel PT cursors for the same thread share the same DecodedThread instance. I'm not yet implementing lazy decoding because we don't need it. That'll be for later. For the time being, the entire thread trace is decoded when the first cursor for that thread is requested.
Differential Revision: https://reviews.llvm.org/D105531
Turning on -funique-internal-linkage-names when -fpseudo-probe-for-profiling is on, unless -fno-unique-internal-linkage-names is specified.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D106193
The current implementation of computeBECount doesn't account for the
possibility that adding "Stride - 1" to Delta might overflow. For almost
all loops, it doesn't, but it's not actually proven anywhere.
To deal with this, use a variety of tricks to try to prove that the
addition doesn't overflow. If the proof is impossible, use an alternate
sequence which never overflows.
Differential Revision: https://reviews.llvm.org/D105216
For example, I need this lately in my CI config:
LIT_XFAIL_NOT='libomptarget :: nvptx64-nvidia-cuda :: unified_shared_memory/api.c'
That test specifies an XFAIL directive, but I get an XPASS result.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D106022
We've seen reports of crashes (none we've been able to reproduce
locally) that look like they are caused by concurrent access to a
thread plan stack. It looks like there are error paths when an
interrupt request to debugserver times out that cause this problem.
The thread plan stack access is never in a hot loop, and there
aren't enough of them for the extra data member to matter, so
there's really no good reason not to protect the access.
Adding the mutex revealed a couple of places where we were
using "auto" in an iteration when we should have been using
"auto &" - we didn't intend to copy the stack - and I fixed
those as well.
Except for preventing crashes this should be NFC.
Differential Revision: https\://reviews.llvm.org/D106122
libc++ has started splicing standard library headers into much more
fine-grained content for maintainability. It's very likely that outdated
and naive tooling (some of which is outside of LLVM's scope) will
suggest users include things such as `<__algorithm/find.h>` instead of
`<algorithm>`, and Hyrum's law suggests that users will eventually begin
to rely on this without the help of tooling. As such, this commit
intends to protect users from themselves, by making it a hard error for
anyone outside of the standard library to include libc++ detail headers.
This is the first of four patches. Patch #2 will solve the problem for
pre-processor `#include`s; patches #3 and #4 will solve the problem for
`<__tree>` and `<__hash_table>` (since I've never touched the test cases
that are failing for these two, I want to split them out into their own
commits to be extra careful). Patch #5 will concern itself with
`<__threading_support>`, which intersects with libcxxabi (which I know
even less about).
Differential Revision: https://reviews.llvm.org/D105932
Ensure that libSupport does not carry any static global initializer.
libSupport can be embedded in use cases where we don't want to load all
cl::opt unless we want to parse the command line.
ManagedStatic can be used to enable lazy-initialization of globals.
Create an internal alias with the original name for static functions
that are renamed in promoteInternals to avoid breaking inline
assembly references to them. This version uses module inline assembly
to avoid issues with LowerTypeTestsModule.
Link: https://github.com/ClangBuiltLinux/linux/issues/1354
Reviewed By: nickdesaulniers, pcc
Differential Revision: https://reviews.llvm.org/D104058
This provides intrinsics for emitting instructions that set the FPSCR (`mtfsf/mtfsfi`).
The patch also conservatively marks the rounding mode as an implicit def for both since they both may set the rounding mode depending on the operands.
Reviewed By: #powerpc, qiucf
Differential Revision: https://reviews.llvm.org/D105957
This avoids Bazel recursing into these directories when overlayed, which
will break if someone has run Bazel in these dirs (which would only be
successful with the http_archive example) because of the Bazel directory
symlinks (already gitignored).
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D105357
Add explicit coverage provider. Also remove output_to_genfiles which
isn't necessary for this test (it's just copy-pasta from gentbl_rule,
which needs it for output C++ header files).
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D106115
Part of D105020. Also, fixed FIXMEs that need to use wider vector type
when trying to calculate the cost of reused scalars. This may cause
regressions unless D100486 is landed to improve the cost estimations
for long vectors shuffling.
Differential Revision: https://reviews.llvm.org/D106060