This clean-up removes checks for _WIN64, as the _WIN32 macro is defined as 1
whenever the compilation target is 32- or 64-bit ARM.
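As a rough illustration of why the extra check is redundant, a single `_WIN32`
test is enough to detect a Windows target. `targetsWindows` is a hypothetical
helper used only for this sketch and is not part of the patch.

```cpp
// Hypothetical helper, for illustration only.
constexpr bool targetsWindows() {
#if defined(_WIN32)
  // _WIN32 is defined for 64-bit Windows targets too, so a separate
  // `|| defined(_WIN64)` test adds nothing.
  return true;
#else
  return false;
#endif
}
```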
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D106706
Bug 50022 [0] reports that target nowait fails in a certain case, which is added
in this patch. The root cause of the failure is that, when the second task is
created, its parent's `td_incomplete_child_tasks` is not incremented: there is
no parallel region here, so its team is serialized. Therefore, when the initial
thread waits for its unfinished child tasks, it believes there is only one, the
first task, which is tracked because it is a hidden helper task. The second task
is only pushed to the queue when the first task finishes. However, when the
first task finishes, it first decrements the counter of its parent and then
releases dependences. Once the counter is decremented, the thread moves on
because its counter is reset, although the second task has not been executed at
all. As a result, since the main function finishes in this case, `libomp` starts
to shut down. By the time the second task is pushed somewhere, some of the
structures might already have been destroyed, and then anything could happen.
This patch simply moves `__kmp_release_deps` ahead of the decrement of the
counter. In this way, we make sure that the initial thread is aware of the
existence of any other task(s), so it will not move on. In addition, to tackle a
dependence chain starting with a hidden helper thread, when a hidden helper task
is encountered, we force the task to release dependences.
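The ordering problem can be sketched with a toy, single-threaded model. All
types and functions below are hypothetical and only mimic the bookkeeping
described above. In particular, the model assumes that releasing dependences is
what makes the successor visible in the parent's counter; the real runtime is
more involved.

```cpp
// Toy model only: none of these types exist in libomp.
struct Parent {
  int incompleteChildren; // stands in for td_incomplete_child_tasks
};

struct Task {
  Parent *parent;
  Task *successor; // task that depends on this one, or nullptr
  bool tracked;    // whether the parent's counter accounts for this task
};

// Stands in for __kmp_release_deps: in this model the successor becomes
// visible to the parent when its dependence is released.
void releaseDeps(Task &t) {
  Task *next = t.successor;
  if (next != nullptr && !next->tracked) {
    ++next->parent->incompleteChildren;
    next->tracked = true;
  }
}

// Returns true if the parent's counter hits zero right after the decrement,
// i.e. the waiting thread could move on while the second task is pending.
bool parentSeesZeroWhileWorkRemains(bool releaseFirst) {
  Parent initial{1}; // only the first (hidden helper) task is tracked
  Task second{&initial, nullptr, false};
  Task first{&initial, &second, true};

  if (releaseFirst) { // order after the patch
    releaseDeps(first);
    --initial.incompleteChildren;
    return initial.incompleteChildren == 0;
  }
  // Order before the patch: decrement, then release.
  --initial.incompleteChildren;
  bool sawZero = initial.incompleteChildren == 0;
  releaseDeps(first);
  return sawZero;
}
```

With the old order the counter reaches zero before the successor is released;
with the new order it never does in this model.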
Reference:
[0] https://bugs.llvm.org/show_bug.cgi?id=50022
Reviewed By: AndreyChurbanov
Differential Revision: https://reviews.llvm.org/D106519
Unrolling this loop provides better performance in practice because it is
executed on the device and is likely to be very small.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D106692
Add the ability to:
1. tell simctl to wait for debugger when spawning process
2. print the command that is called to launch the process
Reviewed By: delcypher
Differential Revision: https://reviews.llvm.org/D106700
- Rename isLastUse to isDeadAfter to reflect what the function does.
- Avoid a second walk over all operations in BlockInfoBuilder constructor.
- Use std::move() to save the new in set.
Differential Revision: https://reviews.llvm.org/D106702
The check for sinking instructions past the load + cmp sequence
currently checks for side-effects, which includes writing to memory
and unwinding. However, I don't believe we care about sinking the
instructions past an unwind (as they don't have any side-effects
themselves).
Differential Revision: https://reviews.llvm.org/D106591
This patch tries to partially fix one of the two data race issues reported in
[1] by introducing a per-entry mutex. Additional discussion can also be found in
D104418, which will also be refined to fix another data race problem.
Here is how it works. As before, `DataMapMtx` is still used for mapping table
lookup and update. In either case we get a table entry. If we need to make a
data transfer (update the data on the device), we lock the entry right before
releasing `DataMapMtx`, issue the data transfer after releasing `DataMapMtx`,
and unlock the entry afterwards. This guarantees that: 1) the issue of data
movement is not in the `DataMapMtx` critical region, so it does not affect
performance too much and does not affect other threads that don't touch the same
entry; 2) if another thread accesses the same entry, the state of data movement
is consistent (which requires that a thread first gets the update lock before
getting data movement information).
For a target that doesn't support async data transfer, issuing the data movement
is the data transfer itself. This two-lock design can potentially improve
concurrency compared with a design that guards data movement with `DataMapMtx`
as well. For a target that supports async data movement, we could simply attach
the event between issuing the data movement and unlocking the entry. A thread
that wants to get the event must first get the lock. This also gets rid of the
busy wait until the event pointer is valid.
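A minimal sketch of the two-lock scheme, assuming a simplified table keyed by an
integer. `MappingTable`, `Entry`, `updateOnDevice` and `read` are hypothetical
names, and the "transfer" is just an assignment; this is not the libomptarget
implementation.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>

// Hypothetical types, for illustration only.
struct Entry {
  std::mutex updateMtx; // per-entry lock guarding data movement state
  int deviceValue = 0;  // stands in for the data on the device
};

class MappingTable {
  std::mutex dataMapMtx; // guards lookup and update of the table itself
  std::map<int, std::unique_ptr<Entry>> entries;

public:
  void updateOnDevice(int key, int value) {
    Entry *entry = nullptr;
    std::unique_lock<std::mutex> entryLock;
    {
      std::lock_guard<std::mutex> tableLock(dataMapMtx);
      std::unique_ptr<Entry> &slot = entries[key];
      if (!slot)
        slot.reset(new Entry());
      entry = slot.get();
      // Lock the entry right before the table lock is released.
      entryLock = std::unique_lock<std::mutex>(entry->updateMtx);
    }
    // The table lock is gone; only this entry is blocked during the transfer.
    entry->deviceValue = value;
  } // entryLock unlocks the entry here

  int read(int key) {
    Entry *entry = nullptr;
    std::unique_lock<std::mutex> entryLock;
    {
      std::lock_guard<std::mutex> tableLock(dataMapMtx);
      auto it = entries.find(key);
      assert(it != entries.end() && "key must be mapped first");
      entry = it->second.get();
      entryLock = std::unique_lock<std::mutex>(entry->updateMtx);
    }
    // Holding the entry lock means any in-flight update has completed.
    return entry->deviceValue;
  }
};
```

Both paths take the locks in the same order (table, then entry), and the holder
of an entry lock never needs the table lock to finish its work.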
Reference:
[1] https://bugs.llvm.org/show_bug.cgi?id=49940
Reviewed By: grokos
Differential Revision: https://reviews.llvm.org/D104555
This matches what MS rc.exe allows in practice. I'm not aware of
any legal syntax cases that are broken by allowing dashes as part
of what the tokenizer considers an Identifier - but I'm not
very well versed in the RC syntax either; can @amccarth think of
any case that would be broken by this?
This fixes downstream bug
https://github.com/msys2/MINGW-packages/issues/9180.
Additionally, rc.exe allows such resource name strings to be surrounded
by quotes, ending up with e.g.
Resource name (string): "QUOTEDNAME"
(i.e., the quotes end up as part of the string), which llvm-rc doesn't
support yet either. (I'm not aware of such cases in the wild though,
but resource string names with dashes do exist.)
This also allows including files with unquoted paths, with filenames
containing dashes (which fixes
https://github.com/msys2/MINGW-packages/issues/9130, which has been
worked around differently so far).
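A rough sketch of the tokenizer rule, assuming identifiers start with a letter
or underscore and may then continue with letters, digits, underscores or dashes.
The helper names are hypothetical and are not llvm-rc's actual functions; the
rule for the first character in particular is an assumption of this sketch.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helpers, for illustration only.
bool isIdentifierStart(char c) {
  return std::isalpha(static_cast<unsigned char>(c)) || c == '_';
}

bool isIdentifierContinue(char c) {
  // The dash is the newly accepted character.
  return std::isalnum(static_cast<unsigned char>(c)) || c == '_' || c == '-';
}

// Returns the identifier starting at `pos`, or an empty string if none.
std::string lexIdentifier(const std::string &input, std::size_t pos) {
  assert(pos <= input.size());
  if (pos == input.size() || !isIdentifierStart(input[pos]))
    return "";
  std::size_t end = pos + 1;
  while (end < input.size() && isIdentifierContinue(input[end]))
    ++end;
  return input.substr(pos, end - pos);
}
```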
Differential Revision: https://reviews.llvm.org/D106598
With this, libclang_rt.profile_osx.a can be linked; that is, coverage
and PGO-instrumented builds should now work with lld.
section$start and section$end symbols can create non-existing sections.
They're also undefined symbols that are only magic if there isn't a
regular symbol with their name, which means they need to be handled
in treatUndefined() instead of just looping over all existing
sections and adding start and end symbols like the ELF port does.
To represent the actual symbols, this uses absolute symbols that
get their value updated once an output section is laid out.
segment$start and segment$end are still missing for now, but they produce a
nicer error message after this patch.
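For illustration, recognizing such a name could look roughly like the sketch
below. It assumes the ld64 naming convention `section$start$<segment>$<section>`
(and likewise for `section$end$`); `BoundaryRef` and `parseSectionBoundary` are
hypothetical names, not lld's actual code.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type, for illustration only.
struct BoundaryRef {
  bool valid;
  bool isStart;
  std::string segment;
  std::string section;
};

// Splits "section$start$SEG$SECT" or "section$end$SEG$SECT" into its parts.
BoundaryRef parseSectionBoundary(const std::string &name) {
  BoundaryRef ref{false, false, "", ""};
  const std::string startPrefix = "section$start$";
  const std::string endPrefix = "section$end$";

  std::string rest;
  if (name.compare(0, startPrefix.size(), startPrefix) == 0) {
    ref.isStart = true;
    rest = name.substr(startPrefix.size());
  } else if (name.compare(0, endPrefix.size(), endPrefix) == 0) {
    rest = name.substr(endPrefix.size());
  } else {
    return ref; // an ordinary undefined symbol
  }

  const std::size_t sep = rest.find('$');
  if (sep == std::string::npos)
    return ref; // malformed: the section part is missing
  ref.segment = rest.substr(0, sep);
  ref.section = rest.substr(sep + 1);
  ref.valid = true;
  return ref;
}
```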
Main part of PR50760.
Differential Revision: https://reviews.llvm.org/D106629
This commit modifies stepWithDwarf allowing for CFI directives to
specify the value of the stack pointer.
Previously, the SP would be unconditionally set to the CFA, because it
(wrongly) stated that the CFA is the stack pointer at the call site of a
function, but that is not always true.
One situation in which that is false is, for example, if you have
switched stacks. In that case, if you set the CFA to the SP before
switching the stack, the CFA would be far away from where the current
call frame is located.
The CFA always points to the current call frame, and that call frame
could have a CFI directive that specifies how to restore the stack
pointer. If not, it is OK to fall back and set the SP = CFA.
This change sets SP = CFA before restoring the registers during
unwinding, allowing the stack frame to be restored with a value
different than the CFA.
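The ordering can be sketched as follows, assuming a simplified register file
and a map of already-computed restore values. `Registers`, `stepFrame` and
`kRegSP` are hypothetical names for this sketch and are not libunwind's API.

```cpp
#include <cstdint>
#include <map>

// Hypothetical register number for the stack pointer.
constexpr int kRegSP = 31;

// Hypothetical register file: register number -> value.
struct Registers {
  std::map<int, std::uint64_t> values;
};

// `restored` holds the values that the frame's CFI rules produce for the
// registers they mention. It may or may not contain an entry for kRegSP.
Registers stepFrame(std::uint64_t cfa,
                    const std::map<int, std::uint64_t> &restored) {
  Registers next;
  // Default first: SP = CFA.
  next.values[kRegSP] = cfa;
  // Then apply the rules, so an explicit SP rule overrides the default.
  for (const auto &entry : restored)
    next.values[entry.first] = entry.second;
  return next;
}
```

Setting the default before the loop is what lets a frame that switched stacks
restore an SP that differs from the CFA.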
Reviewed By: #libunwind, phosek
Differential Revision: https://reviews.llvm.org/D106626
This type is 'fat' now thanks to the callbacks, so it should never be
copied as far as I know. Delete the copy operations so that we don't do
so accidentally.
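A small sketch of the pattern, using a hypothetical `CallbackHolder` class in
place of the real type (which is not named here):

```cpp
#include <functional>
#include <type_traits>
#include <utility>

// Hypothetical class, for illustration only.
class CallbackHolder {
public:
  explicit CallbackHolder(std::function<void(int)> callback)
      : onEvent(std::move(callback)) {}

  // Copying would duplicate the stored callback, so forbid it outright.
  CallbackHolder(const CallbackHolder &) = delete;
  CallbackHolder &operator=(const CallbackHolder &) = delete;

  void emit(int value) const {
    if (onEvent)
      onEvent(value);
  }

private:
  std::function<void(int)> onEvent;
};

// Accidental copies now fail at compile time.
static_assert(!std::is_copy_constructible<CallbackHolder>::value,
              "CallbackHolder must not be copyable");
static_assert(!std::is_copy_assignable<CallbackHolder>::value,
              "CallbackHolder must not be copy-assignable");
```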
Avoid buffering just to copy the buffered data, in 'development
mode', when logging. Instead, just populate the underlying protobuf.
Differential Revision: https://reviews.llvm.org/D106592
Since we are using assumed information now, the logic should be refined to avoid
an unnecessary assertion.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D106630
This patch adds LLVM_LIBC_INCLUDE_SCUDO as a flag. When enabled it
should link in the standalone version of SCUDO as the allocator for LLVM
libc.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D106502
Move the tests to libcxx so they no longer need `REQUIRES: libc++`.
Verify tests don't need `REQUIRES: libc++`.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D106673
This patch adds a reduced test case which identifies an illegal vsetvli
inserted by the compiler. The compiler emits a vsetvli which is intended
to preserve VL with the SEW/LMUL ratio e32/m1 when in fact the VL could
have been set by e64/m2 in a predecessor block.
Differential Revision: https://reviews.llvm.org/D106286
Since we're changing VTYPE, we may change VLMAX which could
invalidate the previous VL. If we can't tell if it is safe we
should use an AVL of 1 instead of keeping the old VL.
This is a quick fix. We may want to thread VL to the pseudo
instruction instead of making up a value. That will require ISD
opcode changes and changes to the C intrinsic interface.
This fixes the issue raised in D106286.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D106403
This addition reads command line input to run specific single tests
within a larger call to run all the tests for a particular function.
When the user adds a second argument to the command line, the code skips
all the tests that don't match the user's specified binary. If the user
doesn't specify a test correctly and/or no tests are run, a failure
message prints.
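The filtering behaviour could be sketched roughly as below. This assumes the
filter is taken from `argv[1]` and compared against a test's name; `TestCase`
and `runTests` are hypothetical and are not the actual test framework code.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical test record, for illustration only.
struct TestCase {
  std::string name;
  bool (*run)(); // returns true on success
};

// Returns the number of failed tests, or -1 if no test was run at all.
int runTests(const std::vector<TestCase> &tests, int argc,
             const char *const *argv) {
  const bool hasFilter = argc > 1;
  const std::string filter = hasFilter ? argv[1] : "";

  int ran = 0;
  int failed = 0;
  for (const TestCase &test : tests) {
    if (hasFilter && test.name != filter)
      continue; // skip everything that does not match the request
    ++ran;
    if (!test.run())
      ++failed;
  }

  if (ran == 0) {
    std::fprintf(stderr, "No tests were run; check the test name.\n");
    return -1;
  }
  return failed;
}
```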
Reviewed By: sivachandra, aeubanks
Differential Revision: https://reviews.llvm.org/D105843
This code tries to form a TEST from CMP+AND with an optional
truncate in between. If we looked through the truncate, we may
have extra bits in the AND mask that shouldn't participate in
the checks. Normally SimplifyDemandedBits takes care of this, but
the AND may have another user. So manually mask out any extra bits.
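The masking step amounts to the following arithmetic, shown on plain integers
rather than DAG nodes. `maskForTruncatedTest` is a hypothetical helper for
this sketch, not the code in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Keeps only the bits of `andMask` that survive a truncate to `truncBits`.
std::uint64_t maskForTruncatedTest(std::uint64_t andMask, unsigned truncBits) {
  assert(truncBits >= 1 && truncBits <= 64);
  if (truncBits == 64)
    return andMask; // nothing was truncated away
  const std::uint64_t lowBits = (std::uint64_t{1} << truncBits) - 1;
  return andMask & lowBits;
}
```

For example, a 64-bit mask of `0x1000000FF` seen through a truncate to 32 bits
must be reduced to `0xFF`; bit 32 does not participate in the comparison.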
Fixes PR51175.
Differential Revision: https://reviews.llvm.org/D106634
We lacked a test for bitcode symbol precedence. We assumed that
they followed the same rules as their regular symbol counterparts, but
never had a test to verify that we were matching ld64's behavior. It
turns out that we were largely correct, though we deviate from ld64 when
there are bitcode and non-bitcode symbols of the same name. The test
added in this diff both verifies our behavior and documents the
differences.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D106596
We had a comment that claimed that defined symbols had priority
over common symbols if they occurred in the same archive. In fact, they
appear to have equal precedence. Our implementation already does this,
so I'm just updating the test comment. Also added a few other test
comments along the way for readability.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D106595
In particular, relocations to absolute symbols or literal sections can
be handled in equalsConstant(), since their output addresses will not
change across each iteration of ICF. Offsets and addends can also be
dealt with entirely in equalsConstant(), making the code somewhat easier
to reason about. Only ConcatInputSections need to be handled in
equalsVariable().
LLD-ELF's implementation takes a similar approach.
Although this should make ICF do less work, in practice there seems to be
no statistically significant difference in time taken when linking
chromium_framework.
This refactor is motivated by an upcoming diff which improves ICF's handling of
addends.
Reviewed By: #lld-macho, gkm
Differential Revision: https://reviews.llvm.org/D106212
I found icf.s a bit hard to work with as it was not possible to
extend any of the functions `_a` ... `_k` to test new relocation /
referent types without modifying every single one of them. Additionally,
their one-letter names were not descriptive (though the comments
helped).
I've renamed all the functions to reflect the feature they are testing,
and shrunk them so that they contain just enough to test that one
feature.
I've also added tests for non-zero addends (via the
`_abs1a_ref_with_addend` and `_defined_ref_with_addend_1` functions).
Reviewed By: #lld-macho, gkm
Differential Revision: https://reviews.llvm.org/D106211
This patch is the initial support. It implements translation from an object file to a JIT link graph, and only a few relocations are supported so far. Currently, the test file ELF_pc_indirect.s passes, and the HelloWorld program (compiled with the -mno-relax flag) can be linked correctly and runs correctly on an instruction emulator.
In the downstream implementation, I have implemented the GOT and PLT functions; EHFrame and some optimizations will be implemented soon. I will organize the code into patches and then gradually send them upstream.
Differential Revision: https://reviews.llvm.org/D105429
segment$start$/segment$end$ symbols allow creating segments without
sections, so getting the segment address off the first section
won't work there. Storing the address on the segment is arguably a
bit simpler too.
No behavior change, part of PR50760.
Differential Revision: https://reviews.llvm.org/D106665
When the output indexing map has a permutation, we need to consider it in
the contraction vector type.
Differential Revision: https://reviews.llvm.org/D106469
Most modern tools only accept two-dash long options. Remove one-dash
long options which are not recognized by GNU style `getopt_long`.
This ensures long options cannot collide with grouped short options.
Note: llvm-symbolizer has `-demangle={true,false}` for pprof compatibility
(for a while). They are kept.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D106377
Absolute symbols have a nullptr isec. buildInputSectionPriorities()
would dereference isec, causing crashes. Ordering absolute symbols doesn't
make sense, so just ignore them. This seems to match ld64.
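The guard can be sketched as below, with simplified stand-ins for the linker's
types. `Symbol`, `InputSection` and `buildPriorities` here are hypothetical and
only mirror the shape of the fix.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified types for illustration only.
struct InputSection {
  std::string name;
};

struct Symbol {
  std::string name;
  const InputSection *isec; // nullptr for absolute symbols
  std::size_t priority;
};

// Maps each section to the highest priority among the symbols it contains.
std::map<const InputSection *, std::size_t>
buildPriorities(const std::vector<Symbol> &symbols) {
  std::map<const InputSection *, std::size_t> priorities;
  for (const Symbol &sym : symbols) {
    if (sym.isec == nullptr)
      continue; // absolute symbol: nothing to order, and nothing to dereference
    auto it = priorities.find(sym.isec);
    if (it == priorities.end() || sym.priority > it->second)
      priorities[sym.isec] = sym.priority;
  }
  return priorities;
}
```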
Differential Revision: https://reviews.llvm.org/D106628