In D106041, a freeze was added before the branch condition to solve the miscompilation problem of SimpleLoopUnswitch.
However, I found that the added freeze disturbed other optimizations in the following situations.
```
arg.fr = freeze(arg)
use(arg.fr)
...
use(arg)
```
It is a problem that occurred when arg and arg.fr were recognized as different values.
Therefore, changing to use arg.fr instead of arg throughout the function eliminates the above problem.
Thus, I add a function that changes all uses of arg to freeze(arg) to visitFreeze of InstCombine.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D106233
Currently when linking LLVM against Libxml2, a simple check is performed to check whether it can be linked successfully. This check currently adds the include directories and the libraries for libxml2, but not definitions found by the config.
This causes issues on Windows when trying to link against a static libxml2. Libxml2 requires LIBXML_STATIC to be defined in the preprocessor to be able to link statically. This definition is put into LIBXML2_DEFINITIONS in the cmake config, but not properly forwarded to check_symbol_exists leading to it failing as it could not find xmlReadMemory in a DLL.
This patch simply appends the content of LIBXML2_DEFINITIONS to the symbol check definitions, fixing the issue.
Differential Revision: https://reviews.llvm.org/D106740
This is just a workaround. Pass the `-mllvm,-O0` link flags only if its
not ThinLTO. Doing that with ThinLTO currently results in an error:
```
Remaining virtual register operands
UNREACHABLE executed at .../llvm/lib/CodeGen/MachineRegisterInfo.cpp:209!
```
This patch introduces a pass that uses the Attributor to deduce AMDGPU specific attributes.
Reviewed By: jdoerfert, arsenm
Differential Revision: https://reviews.llvm.org/D104997
When a WRITE overwrites an endfile record, we need to forget
that there was an endfile record. When doing a BACKSPACE
after an explicit ENDFILE statement, the position afterwards
must be upon the endfile record.
Attempts to join list-directed delimited character input across
record boundaries was due to a bad reading of the standard
and has been deleted, now that the requirements are better understood.
This problem would cause a read attempt past EOF if a delimited
character input value was at the end of a record.
It turns out that delimited list-directed (and NAMELIST) character
output is required to emit contiguous doubled instances of the
delimiter character when it appears in the output value. When
fixed-size records are being emitted, as is the case with internal
output, this is not possible when the problematic character falls
on the last position of a record. No two other Fortran compilers
do the same thing in this situation so there is no good precedent
to follow.
Because it seems least wrong, with this patch we now emit one copy
of the delimiter as the last character of the current record and
another as the first character of the next record. (The
second-least-wrong alternative might be to flag a runtime error,
but that seems harsh since it's not an explicit error in the standard,
and the output may not have to be usable later as input anyway.)
Consequently, the output is not suitable for use as list-directed or
NAMELIST input.
If a later standard were to clarify this case, this behavior will of
course change as needed to conform.
Differential Revision: https://reviews.llvm.org/D106695
NAMELIST I/O formatting uses the runtime infrastructure for
list-directed I/O. List-directed input processing has same state
that requires reinitialization for each successive NAMELIST input
item. This patch fixes bugs with "null" items and repetition counts
on NAMELIST input items after the first in the group.
Differential Revision: https://reviews.llvm.org/D106694
This patch adds a virtual method HasError to fields, it can be
overridden by fields that have errors. Additionally, a form method
CheckFieldsValidity was added to be called by actions that expects all
the field to be valid.
Differential Revision: https://reviews.llvm.org/D106459
This patch adds a new Platform Plugin Field. It is a choices field that
lists all the available platform plugins and can retrieve the name of the
selected plugin. The default selected plugin is the currently selected
one. This patch also allows for arbitrary scrolling to make scrolling
easier when setting choices.
Differential Revision: https://reviews.llvm.org/D106483
D104406 introduced an error in which, if there are multiple matchings rules for a given path, lldb was only checking for the validity in the filesystem of the first match instead of looking exhaustively one by one until a valid file is found.
Besides that, a call to consume_front was being done incorrectly, as it was modifying the input, which renders subsequent matches incorrect.
I added a test that checks for both cases.
Differential Revision: https://reviews.llvm.org/D106723
* ELF supports `nodeduplicate`.
* ELF calls the concept "section group". `GRP_COMDAT` emulates the PE COMDAT deduplication feature.
* "COMDAT group" is an ELF term. Avoid it for PE/COFF.
* WebAssembly supports comdat but only supports the `any` selection kind. https://bugs.llvm.org/show_bug.cgi?id=50531
* A comdat must be included or omitted as a unit. Both the compiler and the linker must obey this rule.
* A global object can be a member of at most one comdat.
* COFF requires a non-local linkage for non-`nodeduplicate` selection kinds.
* llvm.global_ctors/.llvm.global_dtors: if the third field is used on ELF, it must reference a global variable or function in a comdat
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D106300
checkForAllInstructions was not handling declarations correctly.
It should have been returning false when it gets called on a declaration
The patch also fixes a test case for AAFunctionReachability for it to be able
to pass after the changes to the checkForAllinstructions.
Differential Revision: https://reviews.llvm.org/D106625
This is referenced in several of the cmake files that are part of an llvm install and it is also useful by downstream components such as onnx-mlir.
Differential Revision: https://reviews.llvm.org/D106686
Eli pointed out the issue when reviewing D104140. The max trip count logic makes an assumption that the value of IV changes. When the step is zero, the nowrap fact becomes trivial, and thus there's nothing preventing the loop from being nearly infinite. (The "nearly" part is because mustprogress may disallow an infinite loop while still allowing 999999999 iterations before RHS happens to allow an exit.)
This is very difficult to see in practice. You need a means to produce a loop varying RHS in a mustprogress loop which doesn't allow the loop to be infinite. In most cases, LICM or SCEV are smart enough to remove the loop varying expressions.
Differential Revision: https://reviews.llvm.org/D106327
Fuchsia's death test framework runs the closure which can die in a
different thread. Hence, the FP exceptions which cause the closure to
die should be enalbed in the closure.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D106683
This is the last instance of td_srcs in MLIR core build files. `deps` is
generally preferred. There are still some cases where `td_srcs` is
useful where creating a td_library would just be another layer of
indirection, so not (yet) dropping it entirely.
Differential Revision: https://reviews.llvm.org/D106716
This was previously combining indices even though they operate on
different types. For non-opaque pointers, the condition is
automatically satisfied based on the pointer types being equal.
Replace the clang builtins and LLVM intrinsics for {f32x4,f64x2}.{pmin,pmax}
with standard codegen patterns. Since wasm_simd128.h uses an integer vector as
the standard single vector type, the IR for the pmin and pmax intrinsic
functions contains bitcasts that would not be there otherwise. Add extra codegen
patterns that can still select the pmin and pmax instructions in the presence of
these bitcasts.
Differential Revision: https://reviews.llvm.org/D106612
Fixes PR 51174. c++14 should be a more portable option than gnu++14.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D106632
Otherwise e.g. the FoldTwoEntryPHINode() has to do a lot of legwork
to re-deduce what is the dominant block (i.e. for which block
is this branch the terminator).
This clean-up removes checks for _WIN64, as the _WIN32 macro returns 1
whenever the compilation targe is 32- or 64-bit ARM.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D106706
Bug 50022 [0] reports target nowait fails in certain case, which is added in this
patch. The root cause of the failure is, when the second task is created, its
parent's `td_incomplete_child_tasks` will not be incremented because there is no
parallel region here thus its team is serialized. Therefore, when the initial
thread is waiting for its unfinished children tasks, it thought there is only
one, the first task, because it is hidden helper task, so it is tracked. The
second task will only be pushed to the queue when the first task is finished.
However, when the first task finishes, it first decrements the counter of its
parent, and then release dependences. Once the counter is decremented, the thread
will move on because its counter is reset, but actually, the second task has not
been executed at all. As a result, since in this case, the main function finishes,
then `libomp` starts to destroy. When the second task is pushed somewhere, all
some of the structures might already have already been destroyed, then anything
could happen.
This patch simply moves `__kmp_release_deps` ahead of decrement of the counter.
In this way, we can make sure that the initial thread is aware of the existence
of another task(s) so it will not move on. In addition, in order to tackle
dependence chain starting with hidden helper thread, when hidden helper task is
encountered, we force the task to release dependences.
Reference:
[0] https://bugs.llvm.org/show_bug.cgi?id=50022
Reviewed By: AndreyChurbanov
Differential Revision: https://reviews.llvm.org/D106519
Unrolling this loop provides better performance in practice because it is
executed on the device and is likely to be very small.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D106692
Add the ability to:
1. tell simctl to wait for debugger when spawning process
2. print the command that is called to launch the process
Reviewed By: delcypher
Differential Revision: https://reviews.llvm.org/D106700
- Rename isLastUse to isDeadAfter to reflect what the function does.
- Avoid a second walk over all operations in BlockInfoBuilder constructor.
- use std::move() to save the new in set.
Differential Revision: https://reviews.llvm.org/D106702
The check for sinking instructions past the load + cmp sequence
currently checks for side-effects, which includes writing to memory
and unwinding. However, I don't believe we care about sinking the
instructions past an unwind (as they don't have any side-effects
themselves).
Differential Revision: https://reviews.llvm.org/D106591
This patch tries to partially fix one of the two data race issues reported in
[1] by introducing a per-entry mutex. Additional discussion can also be found in
D104418, which will also be refined to fix another data race problem.
Here is how it works. Like before, `DataMapMtx` is still being used for mapping
table lookup and update. In any case, we will get a table entry. If we need to
make a data transfer (update the data on the device), we need to lock the entry
right before releasing `DataMapMtx`, and the issue of data transfer should be
after releasing `DataMapMtx`, and the entry is unlocked afterwards. This can
guarantee that: 1) issue of data movement is not in critical region, which will
not affect performance too much, and also will not affect other threads that don't
touch the same entry; 2) if another thread accesses the same entry, the state of
data movement is consistent (which requires that a thread must first get the
update lock before getting data movement information).
For a target that doesn't support async data transfer, issue of data movement is
data transfer. This two-lock design can potentially improve concurrency compared
with the design that guards data movement with `DataMapMtx` as well. For a target
that supports async data movement, we could simply attach the event between the
issue of data movement and unlock the entry. For a thread that wants to get the
event, it must first get the lock. This can also get rid of the busy wait until
the event pointer is valid.
Reference:
[1] https://bugs.llvm.org/show_bug.cgi?id=49940
Reviewed By: grokos
Differential Revision: https://reviews.llvm.org/D104555
This matches what MS rc.exe allows in practice. I'm not aware of
any legal syntax case that are broken by allowing dashes as part
of what the tokenizer considers an Identifier - but I'm not
very well versed in the RC syntax either, can @amccarth think of
any case that would be broken by this?
This fixes downstream bug
https://github.com/msys2/MINGW-packages/issues/9180.
Additionally, rc.exe allows such resource name strings to be surrounded
by quotes, ending up with e.g.
Resource name (string): "QUOTEDNAME"
(i.e., the quotes end up as part of the string), which llvm-rc doesn't
support yet either. (I'm not aware of such cases in the wild though,
but resource string names with dashes do exist.)
This also allows including files with unquoted paths, with filenames
containing dashes (which fixes
https://github.com/msys2/MINGW-packages/issues/9130, which has been
worked around differently so far).
Differential Revision: https://reviews.llvm.org/D106598
With this, libclang_rt.profile_osx.a can be linked, that is coverage
and PGO-instrumented builds should now work with lld.
section$start and section$end symbols can create non-existing sections.
They're also undefined symbols that are only magic if there isn't a
regular symbol with their name, which means the need to be handled
in treatUndefined() instead of just looping over all existing
sections and adding start and end symbols like the ELF port does.
To represent the actual symbols, this uses absolute symbols that
get their value updated once an output section is layed out.
segment$start and segment$end are still missing for now, but they produce a
nicer error message after this patch.
Main part of PR50760.
Differential Revision: https://reviews.llvm.org/D106629
This commit modifies stepWithDwarf allowing for CFI directives to
specify the value of the stack pointer.
Previously, the SP would be unconditionally set to the CFA, because it
(wrongly) stated that the CFA is the stack pointer at the call site of a
function, but that is not always true.
One situation in which that is false, is for example if you have
switched stacks. In that case if you set the CFA to the SP before
switching the stack, the CFA would be far away from where the current
call frame is located.
The CFA always points to the current call frame, and that call frame
could have a CFI directive that specifies how to restore the stack
pointer. If not, it is OK to fallback and set the SP = CFA.
This change sets SP = CFA before restoring the registers during
unwinding, allowing the stack frame to be restored with a value
different than the CFA.
Reviewed By: #libunwind, phosek
Differential Revision: https://reviews.llvm.org/D106626