Previously by following the documentation it was not immediately clear
what the capabilities of this checker are.
In this patch, I add some clarification on when does the checker issue a
report and what it's limitations are.
I'm also advertising suppressing such reports by adding an assertion, as
demonstrated by the test3().
I'm highlighting that this checker might produce an extensive amount of
findings, but it might be still useful for code audits.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D107756
SelectADDRParam was discovered as being dead 5 years ago and removed in
7b4ef068c6 but the unused ComplexPattern definition was left behind.
SelectADDRDWord has never existed as far as I can tell, even back when
AMDGPU was R600-only and called that.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D108758
This was used by CONVERT_RNDSAT, which was removed in def496c04b, so
the profile is now unused.
Reviewed By: xgupta
Differential Revision: https://reviews.llvm.org/D108508
It turns out that SchedWrite WriteIMulH was always assigned to the low half of
the result of a MULX (rather than to the high half).
To avoid confusion, this patch swaps the two MULX writes in the tablegen
definition of MULX32/64. That way, write names better describe what they
actually refer to; this also avoids further complications if in future we decide
to reuse the same MulH writes to also model other scalar integer multiply
instructions. I also had to swap the latency values for the two MULX writes to
make sure that the change is effectively an NFC. In fact, none of the existing
x86 tests were affected by this small refactoring.
This patch also fixes a bug in MCA: a wrong latency value was propagated for
instructions that perform multiple writes to a same register. This last issue
was found by Roman while testing MULX on targets that define a different latency
for the Low/High part of the result.
Differential Revision: https://reviews.llvm.org/D108727
We have to avoid calling renameat2 and clone on FreeBSD.
Additionally, the mcontext structure has different members.
Reviewed By: jrtc27, luismarques
Differential Revision: https://reviews.llvm.org/D103886
Klocwork static code analysis exposed this concern:
Pointer 'SubExpr' returned from call to getSubExpr() function which may
return NULL from 'cast_or_null<Expr>(Operand)', which will be
dereferenced in the statement following it
Add an assert on SubExpr to make it clear this pointer cannot be null.
Tell the cost model to use the scalable calculation for non-neon fixed vector.
This results in a cheaper cost for fixed-length SVE masked gathers/scatters
allowing the vectorizor to emit them more frequently.
In LLVM IR, `AlignmentBitfieldElementT` is 5-bit wide
But that means that the maximal alignment exponent is `(1<<5)-2`,
which is `30`, not `29`. And indeed, alignment of `1073741824`
roundtrips IR serialization-deserialization.
While this doesn't seem all that important, this doubles
the maximal supported alignment from 512MiB to 1GiB,
and there's actually one noticeable use-case for that;
On X86, the huge pages can have sizes of 2MiB and 1GiB (!).
So while this doesn't add support for truly huge alignments,
which i think we can easily-ish do if wanted, i think this adds
zero-cost support for a not-trivially-dismissable case.
I don't believe we need any upgrade infrastructure,
and since we don't explicitly record the IR version,
we don't need to bump one either.
As @craig.topper speculates in D108661#2963519,
this might be an artificial limit imposed by the original implementation
of the `getAlignment()` functions.
Differential Revision: https://reviews.llvm.org/D108661
This avoids references to the variables be generated when using e.g. max().
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D95050
The existing code attempting to bitcast from a value in the default globals AS
to i8 addrspace(0)* was triggering an assertion failure in our downstream fork.
I found this while compiling poppler for CHERI-RISC-V (we use AS200 for all
globals). The test case uses AMDGPU since that is one of the in-tree targets
with a non-zero default globals address space.
The new test previously triggered a "Invalid constantexpr bitcast!" assertion
and now correctly generates code with addrspace(1) pointers.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D105972
We should be using #if instead of #ifdef here since LLVM_ENABLE_THREADS
is set using #cmakedefine01 so is always defined.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D108110
This patch adds initial support to use facts from @llvm.assume calls. It
intentionally does not handle all possible cases to keep things simple
initially.
For now, the condition from an assume is made available on entry to the
containing block, if the assume is guaranteed to execute. Otherwise it
is only made available in the successor blocks.
Control-flow Enforcement Technology (CET), published by Intel,
introduces shadow stack feature aiming to ensure a return from
a function is directed to where the function was called.
In a CET enabled system, each function call will push return
address into normal stack and shadow stack, when the function
returns, the address stored in shadow stack will be popped and
compared with the return address, program will fail if the 2
addresses don't match.
In exception handling, the control flow may skip some stack frames
and we must adjust shadow stack to avoid violating CET restriction.
In order to achieve this, we count the number of stack frames skipped
and adjust shadow stack by this number before jumping to landing pad.
Reviewed By: hjl.tools, compnerd, MaskRay
Differential Revision: https://reviews.llvm.org/D105968
Signed-off-by: gejin <ge.jin@intel.com>
ApplyElementwise on character operation was always creating a result
ArrayConstructor with the length of the left operand. This is not
correct for concatenation and SetLength operations.
Compute and thread the length to the spot creating the ArrayConstructor
so that the length is correct for those character operations.
Differential Revision: https://reviews.llvm.org/D108711
Summary: This patch is trying to add support for llvm-readobj
--needed-libs option under XCOFF.
For XCOFF, the needed libraries can be found from the Import
File ID Name Table of the Loader Section.
Currently, I am using binary inputs in the test since yaml2obj
does not yet support for writing the Loader Section and the
import file table.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D106643
There are a number of language and preprocessor options that are reset in the `CompilerInvocation` that describes the build of an implicit module. This patch uses the logic for explicit modules as well.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D108710
This prepares general sparse to sparse conversions. The code that
needs to be generated using this new feature is now simply:
(1) coo = sparse_tensor_1->asCOO(); // source format1
(2) sparse_tensor_2 = newSparseTensor(coo); // destination format2
By using COO as an intermediate, we can do *all* conversions without
having to implement the full O(N^2) conversion matrix. Note that we
can always improve particular conversions individually if a faster
solution is required.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D108681
The change adds a switch to allow sample loader to use global pre-inliner's decision instead. The pre-inliner in llvm-profgen makes inline decision globally based on whole program profile and function byte size as cost proxy.
Since pre-inliner also adjusts/merges context profile based on its inline decision, honoring its inline decision in sample loader would lead to better post-inline profile quality especially for thinlto where cross module profile merging isn't possible without pre-inliner.
Minor fix in profile reader is also included. When pre-inliner is use, we now also turn off the default merging and trimming logic unless it's explicitly asked.
Differential Revision: https://reviews.llvm.org/D108677
eecd5d0a broke non-clang host builds.
Some crt code is not always built with the just-built clang.
0da172b checked if the compiler is clang, not assert that the compiler
is clang.
Emscripten SjLj transformation is done in four steps. This will be
mostly the same for the soon-to-be-added Wasm SjLj; the step 1, 3, and 4
will be shared and there will be separate way of doing step 2.
1. Initialize `setjmpTable` and `setjmpTableSize` in the entry BB
2. Handle `setjmp` callsites
3. Handle `longjmp` callsites
4. Cleanup and update SSA
We initialize `setjmpTable` and `setjmpTableSize` in the entry BB. But
if the entry BB contains a `setjmp` call, some `setjmp` handling
transformation will also happen in the entry BB, such as calling
`saveSetjmp`.
This is fine for Emscripten SjLj but not for Wasm SjLj, because in Wasm
SjLj we will add a dispatch BB that contains a `switch` right after the
entry BB, from which we jump to one of post-`setjmp` BBs. And this
dispatch BB should precede all `setjmp` calls.
Emscripten SjLj (current):
```
entry:
%setjmpTable = ...
%setjmpTableSize = ...
...
call @saveSetjmp(...)
```
Wasm SjLj (follow-up):
```
entry:
%setjmpTable = ...
%setjmpTableSize = ...
setjmp.dispatch:
...
; Jump to the right post-setjmp BB, if we are returning from a
; longjmp. If this is the first setjmp call, go to %entry.split.
switch i32 %no, label %entry.split [
i32 1, label %post.setjmp1
i32 2, label %post.setjmp2
...
i32 N, label %post.setjmpN
]
entry.split:
...
call @saveSetjmp(...)
```
So in Wasm SjLj we split the entry BB to make the entry block only for
`setjmpTable` and `setjmpTableSize` initialization and insert a
`setjmp.dispatch` BB. (This part is not in this CL. This will be a
follow-up.) But note that Emscripten SjLj and Wasm SjLj share all
steps except for the step 2. If we only split the entry BB only for Wasm
SjLj, there will be one more `if`-`else` and the code will be more
complicated.
So this CL splits the entry BB in Emscripten SjLj and put only
initialization stuff there as follows:
Emscripten SjLj (this CL):
```
entry:
%setjmpTable = ...
%setjmpTableSize = ...
br %entry.split
entry.split:
...
call @saveSetjmp(...)
```
This is just done to share code with Wasm SjLj. It adds an unnecessary
branch but this will be removed in later optimization passes anyway.
This is in effect NFC, meaning the program behavior will not change, but
existing ll tests files have changed because the entry block was split.
The reason I upload this in a separate CL is to make the Wasm SjLj diff
tidier, because this changes many existing Emscripten SjLj tests, which
can be confusing for the follow-up Wasm SjLj CL.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D108729
Emscripten SjLj and (soon-to-be-added) Wasm SjLj transformation share
many steps:
1. Initialize `setjmpTable` and `setjmpTableSize` in the entry BB
2. Handle `setjmp` callsites
3. Handle `longjmp` callsites
4. Cleanup and update SSA
1, 3, and 4 are identical for Emscripten SjLj and Wasm SjLj. Only the
step 2 is different. This CL extracts the current Emscripten SjLj's
longjmp callsites handling into a function. The reason to make this a
separate CL is, without this, the diff tool cannot compare things well
in the presence of moved code and added code in the followup Wasm SjLj
CL, and it ends up mixing them together, making the diff unreadable.
Also fixes some typos and variable names. So far we've been calling the
buffer argument to `setjmp` and `longjmp` `jmpbuf`, but the name used in
the man page for those functions is `env`, so updated them to be
consistent.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D108728
This change would treat the token `or` in system headers as an
identifier, and elsewhere as an operator. As reported in
llvm.org/pr42427, many users classify their third party library headers
as "system" headers to suppress warnings. There's no clean way to
separate Windows SDK headers from user headers.
Clang is still able to parse old Windows SDK headers if C++ operator
names are disabled. Traditionally this was controlled by
`-fno-operator-names`, but is now also enabled with `/permissive` since
D103773. This change will prevent `clang-cl` from parsing <query.h> from
the Windows SDK out of the box, but there are multiple ways to work
around that:
- Pass `/clang:-fno-operator-names`
- Pass `/permissive`
- Pass `-DQUERY_H_RESTRICTION_PERMISSIVE`
In all of these modes, the operator names will consistently be available
or not available, instead of depending on whether the code is in a
system header.
I added a release note for this, since it may break straightforward
users of the Windows SDK.
Fixes PR42427
Differential Revision: https://reviews.llvm.org/D108720
Rename the M68kOperand::Type enumeration to KindTy to avoid ambiguity
with the Kind field when referencing enumeration values e.g.
`Kind::Value`.
This works around a compilation error under GCC 5, where GCC won't
lookup enum class values if you have a similarly named field
(see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60994).
The error in question is:
`M68kAsmParser.cpp:857:8: error: 'Kind' is not a class, namespace, or enumeration`
Differential Revision: https://reviews.llvm.org/D108723
The plan was to use `wasm.catch.exn` intrinsic to catch exceptions and
add `wasm.catch.longjmp` intrinsic, that returns two values (setjmp
buffer and return value), later to catch longjmps. But because we
decided not to use multivalue support at the moment, we are going to use
one intrinsic that returns a single value for both exceptions and
longjmps. And even if it's not for that, I now think the naming of
`wasm.catch.exn` is a little weird, because the intrinsic can still take
a tag immediate, which means it can be used for anything, not only
exceptions, as long as that returns a single value.
This partially reverts D107405.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D108683
Breaks realpath(, nullptr) for all sanitizers.
Somehow INTERCEPT_FUNCTION and INTERCEPT_FUNCTION_VER return
false even if everything seemingly right.
And this is the issue for COMMON_INTERCEPT_FUNCTION_GLIBC_VER_MIN.
There is a check in every sanitlizer:
if (!INTERCEPT_FUNCTION_VER(name, ver) && !INTERCEPT_FUNCTION(name))
For non-versioned interceptors when INTERCEPT_FUNCTION returns false
it's not considered fatal, and it just prints a warning.
However INTERCEPT_FUNCTION_VER in this case will fallback to
INTERCEPT_FUNCTION replacing realpath with wrong version.
We need to investigate that before relanding the patch.
This reverts commit faef0d042f.
This patch adds a new type of reusable UI components. Searcher Windows
contain a text field to enter a search keyword and a list of scrollable
matches are presented. The target match can be selected and executed
which invokes a user callback to do something with the match.
This patch also adds one searcher delegate, which wraps the common
command completion searchers for simple use cases.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D108545