These seem to fail occasionally (they are marked as possibly requiring
a retry).
When doing a condvar wait_for(), it can wake up before the timeout
as a spurious wakeup. In these cases, the wait_for() method returns that
the timeout wasn't hit, and the test reruns another wait_for().
On Windows, it seems like the wait_for() operation often can end up
returning slightly before the intended deadline - when intending to
wait for 250 milliseconds, it can return after e.g. 235 milliseconds.
In these cases, the wait_for() doesn't indicate a timeout.
Previously, the test then reran a new wait_for() for a full 250
milliseconds each time. So for N consecutive wakeups slightly too early,
we'd wait for (N+1)*250 milliseconds. Now it only reruns wait_for() for
the remaining intended wait duration.
Differential Revision: https://reviews.llvm.org/D99175
This implements the most basic possible nosync inference. The choice of inference rule is taken from the comments in attributor and the discussion on the review of the change which introduced the nosync attribute (0626367202).
This is deliberately minimal. As noted in code comments, I do plan to add a more robust inference which actually scans the function IR directly, but a) I need to do some refactoring of the attributor code to use common interfaces, and b) I wanted to get something in. I also wanted to minimize the "interesting" analysis discussion since that's time intensive.
Context: This combines with existing nofree attribute inference to help prove dereferenceability in the ongoing deref-at-point semantics work.
Differential Revision: https://reviews.llvm.org/D99749
Inline callstacks were being incorrectly displayed in the results of "image lookup --address". The deepest frame wasn't displaying the line table line entry, it was always showing the inline information's call file and line on the previous frame. This is now fixed and has tests to make sure it doesn't regress.
Differential Revision: https://reviews.llvm.org/D98761
Hookup TLI when inferring object size from allocation calls. This allows the analysis to prove dereferenceability for known allocation functions (such as malloc/new/etc) in addition to those marked explicitly with the allocsize attribute.
This is a follow up to 0129cd5 now that the bug fixed by e2c6621e6 is resolved.
As noted in the test, this relies on being able to prove that there is no free between allocation and context (e.g. hoist location). At the moment, this is handled conservatively. I'm working strengthening out ability to reason about no-free regions separately.
Differential Revision: https://reviews.llvm.org/D99737
This doesn't fail when _LIBCPP_HAS_NO_INT128 is defined consistently
in both CMAKE_CXX_FLAGS and LIBCXX_TEST_COMPILER_FLAGS; the XFAIL was
added based on early CI testruns where that flag was missing in
LIBCXX_TEST_COMPILER_FLAGS.
Differential Revision: https://reviews.llvm.org/D99705
We have this logic duplicated in several cases, none of which were exhaustive. Consolidate it in one place.
I don't believe this actually impacts behavior of the callers. I think they all filter their inputs such that their partial implementations were correct. If not, this might be fixing a cornercase bug.
Add runtime APIs, implementations, and tests for ALL, ANY, COUNT,
MAXLOC, MAXVAL, MINLOC, MINVAL, PRODUCT, and SUM reduction
transformantional intrinsic functions for all relevant argument
and result types and kinds, both without DIM= arguments
(total reductions) and with (partial reductions).
Complex-valued reductions have their APIs in C so that
C's _Complex types can be used for their results.
Some infrastructure work was also necessary or noticed:
* Usage of "long double" in the compiler was cleaned up a
bit, and host dependences on x86 / MSVC have been isolated
in a new Common/long-double header.
* Character comparison has been exposed via an extern template
so that reductions could use it.
* Mappings from Fortran type category/kind to host C++ types
and vice versa have been isolated into runtime/cpp-type.h and
then used throughout the runtime as appropriate.
* The portable 128-bit integer package in Common/uint128.h
was generalized to support signed comparisons.
* Bugs in descriptor indexing code were fixed.
Differential Revision: https://reviews.llvm.org/D99666
Since quite a while Apple's LLDB fork (that contains the Swift debugging
support) is randomly crashing in `CommandLineParser::addOption` with an error
such as `CommandLine Error: Option 'h' registered more than once!`
The backtrace of the crashing thread is shown below. There are also usually many
other threads also performing similar clang::FrontendActions which are all
trying to generate (usually outdated) Clang modules which are used by Swift for
various reasons.
```
[ 6] LLDB`CommandLineParser::addOption(llvm:🆑:Option*, llvm:🆑:SubCommand*) + 856
[ 7] LLDB`CommandLineParser::addOption(llvm:🆑:Option*, llvm:🆑:SubCommand*) + 733
[ 8] LLDB`CommandLineParser::addOption(llvm:🆑:Option*, bool) + 184
[ 9] LLDB`llvm:🆑:ParseCommandLineOptions(...) [inlined] ::CommandLineParser::ParseCommandLineOptions(... + 1279
[ 9] LLDB`llvm:🆑:ParseCommandLineOptions(...) + 497
[ 10] LLDB`setCommandLineOpts(clang::CodeGenOptions const&) + 416
[ 11] LLDB`EmitAssemblyHelper::EmitAssemblyWithNewPassManager(...) + 98
[ 12] LLDB`clang::EmitBackendOutput(...) + 4580
[ 13] LLDB`PCHContainerGenerator::HandleTranslationUnit(clang::ASTContext&) + 871
[ 14] LLDB`clang::MultiplexConsumer::HandleTranslationUnit(clang::ASTContext&) + 43
[ 15] LLDB`clang::ParseAST(clang::Sema&, bool, bool) + 579
[ 16] LLDB`clang::FrontendAction::Execute() + 74
[ 17] LLDB`clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) + 1808
```
The underlying reason for the crash is that the CommandLine code in LLVM isn't
thread-safe and will never be thread-safe with its current architecture. The way
LLVM's CommandLine logic works is that all parts of the LLVM can provide command
line arguments by defining `cl::opt` global variables and their constructors
(which are invoked during static initialisation) register the variable in LLVM's
CommandLineParser (which is also just a global variable). At some later point
after static initialization we actually try to parse command line arguments and
we ask the CommandLineParser to parse our `argv`. The CommandLineParser then
lazily constructs it's internal parsing state in a non-thread-safe way (this is
where the crash happens), parses the provided command line and then goes back to
the respective `cl::opt` global variables and sets their values according to the
parse result.
As all of this is based on global state, this whole mechanism isn't thread-safe
so the only time to ever use it is when we know we only have one active thread
dealing with LLVM logic. That's why nearly all callers of
`llvm:🆑:ParseCommandLineOptions` are at the top of the `main` function of the
some LLVM-based tool. One of the few exceptions to this rule is in the
`setCommandLineOpts` function in `BackendUtil.cpp` which is in our backtrace:
```
static void setCommandLineOpts(const CodeGenOptions &CodeGenOpts) {
SmallVector<const char *, 16> BackendArgs;
BackendArgs.push_back("clang"); // Fake program name.
if (!CodeGenOpts.DebugPass.empty()) {
BackendArgs.push_back("-debug-pass");
BackendArgs.push_back(CodeGenOpts.DebugPass.c_str());
}
if (!CodeGenOpts.LimitFloatPrecision.empty()) {
BackendArgs.push_back("-limit-float-precision");
BackendArgs.push_back(CodeGenOpts.LimitFloatPrecision.c_str());
}
BackendArgs.push_back(nullptr);
llvm:🆑:ParseCommandLineOptions(BackendArgs.size() - 1,
BackendArgs.data());
}
```
This is trying to set `cl::opt` variables in the LLVM backend to their right
value as the passed via CodeGenOptions by invoking the CommandLine parser. As
this is just in some generic Clang CodeGen code (where we allow having multiple
threads) this is code is clearly wrong. If we're unlucky it either overwrites
the value of the global variables or it causes the CommandLine parser to crash.
So the next question is why is this only crashing in LLDB? The main reason seems
to be that easiest way to crash this code is to concurrently enter the initial
CommandLineParser construction where it tries to collect all the registered
`cl::opt` options and checks for sanity:
```
// If it's a DefaultOption, check to make sure it isn't already there.
if (O->isDefaultOption() &&
SC->OptionsMap.find(O->ArgStr) != SC->OptionsMap.end())
return;
// Add argument to the argument map!
if (!SC->OptionsMap.insert(std::make_pair(O->ArgStr, O)).second) {
errs() << ProgramName << ": CommandLine Error: Option '" << O->ArgStr
<< "' registered more than once!\n";
HadErrors = true;
}
```
The `OptionsMap` here is global variable and if we end up in this code with two
threads at once then two threads at the same time can register an option (such
as 'h') when they pass the first `if` and then we fail with the sanity check in
the second `if`.
After this sanity check and initial setup code the only remaining work is just
parsing the provided CommandLine which isn't thread-safe but at least doesn't
crash in all my attempts at breaking it (as it's usually just reading from the
already generated parser state but not further modifying it). The exception to
this is probably that once people actually specify the options in the code
snippet above we might run into some new interesting ways to crash everything.
To go back to why it's only affecting LLDB: Nearly all LLVM tools I could find
(even if they are using threads) seem to call the CommandLine parser at the
start so they all execute the initial parser setup at a point where there is
only one thread. So once the code above is executed they are mostly safe from
the sanity check crashes. We even have some shady code for the gtest `main` in
`TestMain.cpp` which is why this also doesn't affect unit tests.
The only exception to this rule is ... *drum roll* ... LLDB! it's not using that
CommandLine library for parsing options so it also never ends up calling it in
`main`. So when we end up in the `FrontendAction` code from the backtrace we are
already very deep in some LLDB logic and usually already have several threads.
In a situation where Swift decides to compile a large amount of Clang modules in
parallel we then end up entering this code via several threads. If several
threads reach this code at the same time we end up in the situation where the
sanity-checking code of CommandLine crashes. I have a very reliable way of
demonstrating the whole thing in D99650 (just run the unit test several times,
it usually crashes after 3-4 attempts).
We have several ways to fix this:
1. Make the whole CommandLine mechanism in LLVM thread-safe.
2. Get rid of `setCommandLineOpts` in `BackendUtil.cpp` and other callers of the
command line parsing in generic Clang code.
3. Initialise the CommandLine library in a safe point in LLDB.
Option 1 is just a lot of work and I'm not even sure where to start. The whole
mechanism is based on global variables and global state and this seems like a
humongous task.
Option 2 is probably the best thing we can do in the near future. There are only
two callers of the command line parser in generic Clang code. The one in
`BackendUtils.cpp` looks like it can be replaced with some reasonable
refactoring (as it only deals with two specific options). There is another one
in `ExecuteCompilerInvocation` which deals with forwarding the generic `-mllvm`
options to the backend which seems like it will just end up requiring us to do
Option 1.
Option 3 is what this patch is doing. We just parse some dummy command line
invocation in a point of the LLDB execution where we only have one thread that
is dealing with LLVM/Clang stuff. This way we are at least prevent the frequent
crashes for users as parsing the dummy command line invocation will set up the
initial parser state safely.
Fixes rdar://70989856
Reviewed By: mib, JDevlieghere
Differential Revision: https://reviews.llvm.org/D99652
The ObjC runtime offers both signed & unsigned tagged pointer value
accessors to tagged pointer providers, but lldb's tagged pointer
code only implemented the unsigned one. This patch adds an
emulation of the signed one.
The motivation for doing this is that NSNumbers use the signed
accessor (they are always signed) and we need to follow that in our
summary provider or we will get incorrect values for negative
NSNumbers.
The data-formatter-objc test file had NSNumber examples (along with lots of other
goodies) but the NSNumber values weren't tested. So I also added
checks for those values to the test.
I also did a quick audit of the other types in that main.m file, and
it looks like pretty much all the other values are either intermediates
or are tested.
Differential Revision: https://reviews.llvm.org/D99694
Calling `ParseCommandLineOptions` should only be called from `main` as the
CommandLine setup code isn't thread-safe. As BackendUtil is part of the
generic Clang FrontendAction logic, a process which has several threads executing
Clang FrontendActions will randomly crash in the unsafe setup code.
This patch avoids calling the function unless either the debug-pass option or
limit-float-precision option is set. Without these two options set the
`ParseCommandLineOptions` call doesn't do anything beside parsing
the command line `clang` which doesn't set any options.
See also D99652 where LLDB received a workaround for this crash.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D99740
Without this patch, we'd always try to codesign the first argument in
the command line, which in some cases is not something we can codesign
(e.g. `bash` for some .sh.cpp tests).
Note that this "hack" is the same thing we do in `ssh.py` - we might need
to admit that it's not a hack after all in the future, but I'm not ready
for that yet.
Differential Revision: https://reviews.llvm.org/D99726
We need to splat the scalar separately and use .vv, but there is
no vmsgt(u).vv. So add isel patterns to select vmslt(u).vv with
swapped operands.
We also need to get VT to use for the splat from an operand rather
than the result since the result VT is nxvXi1.
Reviewed By: HsiangKai
Differential Revision: https://reviews.llvm.org/D99704
There's no target independent ISD opcode for MULHSU, so custom
legalize 2*XLen multiplies ourselves. We have to be a little
careful to prefer MULHU or MULHSU.
I thought about doing this in isel by pattern matching the
(add (mul X, (srai Y, XLen-1)), (mulhu X, Y)) pattern. I decided
against this because the add might become part of a chain of adds.
I don't trust DAG combine not to reassociate with other adds making
it difficult to find both pieces again.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D99479
Respect --apple-sdk <path> if it's specified. If the SDK is simply
mounted from some disk image, and not actually installed, this is the
only way to use it.
Differential Revision: https://reviews.llvm.org/D99746
Doing this during instruction selection avoids the cost of running
SIAddIMGInit which is yet another pass over the MIR.
Differential Revision: https://reviews.llvm.org/D99670
Doing this in a post-isel hook avoids the cost of running SIAddIMGInit
which is yet another pass over the MIR.
Differential Revision: https://reviews.llvm.org/D99747
These variables were introduced during early work on the runtimes build
but were obsoleted by {LIBCXX,LIBCXXABI,LIBUNWIND}_INSTALL_LIBRARY_DIR.
Differential Revision: https://reviews.llvm.org/D99697
Debugging tests sometimes involves debugging the Python source. This adds a paragraph to
the "Debugging Test Failures" section about using `pdb`, and also describes how to run
lldb commands from pdb.
Differential Revision: https://reviews.llvm.org/D99744
This adds a new integer materialization strategy mainly targeted
at 64-bit constants like 0xffffffff where there are 32 or more trailing
ones with leading zeros. We can materialize these by using an addi -1
and srli to restore the leading zeros. This matches what gcc does.
I haven't limited to just these cases though. The implementation
here takes the constant, shifts out all the leading zeros and
shifts ones into the LSBs, creates the new sequence, adds an srli,
and checks if this is shorter than our original strategy.
I've separated the recursive portion into a standalone function
so I could append the new strategy outside of the recursion. Since
external users are no longer using the recursive function, I've
cleaned up the external interface to return the sequence instead of
taking a vector by reference.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D98821
We can't see how much overhead/redundancy is being
created with the partial checks.
To make it smaller and easier to read, I reduced the
vectorization factor because that does not add new
information - it just duplicates things.
Support deriving dereferenceability facts from allocation sites with known object sizes while correctly accounting for any possibly frees between allocation and use site. (At the moment, we're conservative and only allowing it in functions where we know we can't free.)
This is part of the work on deref-at-point semantics. I'm making the change unconditional as the miscompile in this case is way too easy to trip by accident, and the optimization was only recently added (by me).
There will be a follow up patch wiring through TLI since that should now be doable without introducing widespread miscompiles.
Differential Revision: https://reviews.llvm.org/D95815
The main part of the patch is the change in RegAllocGreedy.cpp: Q.collectInterferringVregs()
needs to be called before iterating the interfering live ranges.
The rest of the patch offers support that is the case: instead of clearing the query's
InterferingVRegs field, we invalidate it. The clearing happens when the live reg matrix
is invalidated (existing triggering mechanism).
Without the change in RegAllocGreedy.cpp, the compiler ices.
This patch should make it more easily discoverable by developers that
collectInterferringVregs needs to be called before iterating.
I will follow up with a subsequent patch to improve the usability and maintainability of Query.
Differential Revision: https://reviews.llvm.org/D98232
Set the source ranges for parsed GNU-style attributes in
ParseGNUAttributes(), the same way that ParseCXX11Attributes() does it.
Differential Revision: https://reviews.llvm.org/D75844
- This patch adds in support to accept the "#" character as part of an Identifier.
- This support is needed especially for the HLASM dialect since "#" is treated as part of the valid "Alphabet" range
- The way this is done is by making use of the previous precedent set by the `AllowAtInIdentifier` field in `MCAsmLexer.h`. A new field called `AllowHashInIdentifier` is introduced.
- The static function `IsIdentifierChar` is also updated to accept the `#` character if the `AllowHashInIdentifier` field is set to true.
Note: The field introduced in `MCAsmLexer.h` could very well be moved to `MCAsmInfo.h`. I'm not opposed to it. I decided to put it in `MCAsmLexer` since there seems to be some sort of precedent already with `AllowAtInIdentifier`.
Reviewed By: abhina.sreeskantharajan, nickdesaulniers, MaskRay
Differential Revision: https://reviews.llvm.org/D99277
When an SVE function calls another SVE function using the C calling
convention we use the more efficient SVE VectorCall PCS. However,
for the Fast calling convention we're incorrectly falling back to
the generic AArch64 PCS.
This patch adds the same "can use SVE vector calling convention"
detection used by CallingConv::C to CallingConv::Fast.
Co-authored-by: Paul Walker <paul.walker@arm.com>
Differential Revision: https://reviews.llvm.org/D99657
1. Need to cleanup InstrElementSize map for each new tree, otherwise might
use sizes from the previous run of the vectorization attempt.
2. No need to include into analysis the instructions from the different basic
blocks to save compile time.
Differential Revision: https://reviews.llvm.org/D99677
If the inner shuffle already contains undef elements, then accept them in the merged shuffle as well.
This helps some X86 HADD/SUB patterns where slow targets were ending up with HADD/SUB because the (un)merged shuffles were stuck either side of the ADD/SUB - meaning we ended up with a total cost much higher than the "2*shuffle+add" that a slow target usually expands a HADD/SUB to.