Summary:
It turns out that the order in which we provide completions for expressions is
nondeterministic. This leads to confusing user experience and also breaks the
reproducer tests (as two LLDB tests can go out of sync due to the
non-determinism in the completion lists)
The reason for the non-determinism is that the CompletionConsumer informs us
about decls in the order in which it finds declarations in the lookup store of
the DeclContexts it visits (mainly this snippet in SemaLookup.cpp):
``` lang=c++
// Enumerate all of the results in this context.
for (DeclContextLookupResult R :
Load ? Ctx->lookups()
: Ctx->noload_lookups(/*PreserveInternalState=*/false)) {
[...]
```
This storage of the lookup is sorted by pointer values (see the hash of
`DeclarationName`) and can therefore be non-deterministic. The LLDB code
completion consumer that receives these calls originally expected that the order
of declarations is defined by Clang, but it seems the API expects the client to
provide an order to the completions.
This patch fixes the issue as follows:
* We sort the completions we get from Clang alphabetically and also by the
priority value we get from Clang (with priority value sorting having precedence
over the alphabetical sorting)
* We make all the functions/variables that touch a completion before the sorting
const-qualified. The idea is that this should prevent that we never have
observable side-effect from touching these declarations in a non-deterministic
order (e.g., we don't try to complete the type by accident).
This way we behave like the other parts of Clang which also sort the results by
some deterministic value (usually the name or something computed from a name,
e.g., edit distance to a given string).
We most likely also need to fix the Clang code to make the loop I listed above
deterministic to prevent these issues in the future (tracked in rdar://63442513
). This wouldn't replace the functionality provided in this patch though as we
would still need the priority and overall alphabetical sorting.
Note: I had to increase the lldb-vscode completion limit to 100 as the tests
look for strings that aren't in the first 50 results anymore due to variable
names starting with letters like 'v' (which are now always shown much further
down in the list due to the alphabetical sorting).
Fixes rdar://63200995
Reviewers: JDevlieghere, clayborg
Reviewed By: JDevlieghere
Subscribers: mgrang, abidh
Differential Revision: https://reviews.llvm.org/D80292
ffmpeg/libavcodec/x86/h264_cabac.c inline assembly may produce
movzb 1280(%rbx, %r12), %r12
After D80608, llvm-mc errors:
error: unknown use of instruction mnemonic without a size suffix
This allocation of a workgroup memory is lowered to a
spv.globalVariable. Only static size allocation with element type
being int or float is handled. The lowering does account for the
element type that are not supported in the lowered spv.module based on
the extensions/capabilities and adjusts the number of elements to get
the same byte length.
Differential Revision: https://reviews.llvm.org/D80411
Summary:
When working with the DDG it's useful to be able to query details of the
memory dependencies between two nodes connected by a memory edge. The DDG
does not hold a copy of the dependencies, but it contains a reference to a
DependenceInfo object through which dependence information can be queried.
This patch adds a query function to the DDG to obtain all the Dependence
objects that exist between instructions of two nodes.
Authored By: bmahjour
Reviewers: Meinersbur, Whitney, etiotto
Reviewed By: Whitney
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80529
Summary:
This includes a basic implementation for the OpenMP parallel
operation without a custom pretty-printer and parser.
The if, num_threads, private, shared, first_private, last_private,
proc_bind and default clauses are included in this implementation.
Currently the reduction clause is omitted as it is more complex and
requires analysis to see if we can share implementation with the loop
dialect. The allocate clause is also omitted.
A discussion about the design of this operation can be found here:
https://llvm.discourse.group/t/openmp-parallel-operation-design-issues/686
The current OpenMP Specification can be found here:
https://www.openmp.org/wp-content/uploads/OpenMP-API-Specification-5.0.pdf
Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>
Reviewers: jdoerfert
Subscribers: mgorny, yaxunl, kristof.beyls, guansong, mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, grosul1, frgossen, Kayjukh, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79410
In the current statepoint design, we have four distinct groups of operands to the call: call args, gc transition args, deopt args, and gc args. This format prexisted the support in IR for operand bundles and was in fact one of the inspirations for the extension. However, we never went back and rearchitected statepoints to fully leverage bundles.
This change is the first in a small sequence to do so. All this does is extend the SelectionDAG lowering code to allow deopt and gc transition operands to be specified in either inline argument bundles or operand bundles.
Differential Revision: https://reviews.llvm.org/D8059
Unlike other platforms using ItaniumCXXABI, Darwin does not allow the
creation of a thread-wrapper function for a variable in the TU of
users. Because of this, it can set the linkage of the thread-local
symbol to internal, with the assumption that no TUs other than the one
defining the variable will need it.
However, constinit thread_local variables do not require the use of
the thread-wrapper call, so users reference the variable
directly. Thus, it must not be converted to internal, or users will
get a link failure.
This was a regression introduced by the optimization in
00223827a9.
Differential Revision: https://reviews.llvm.org/D80417
Take advantage of equality constrains to generate the type inference interface.
This is used for equality and trivially built types. The type inference method
is only generated when no type inference trait is specified already.
This reorders verification that changes some test error messages.
Differential Revision: https://reviews.llvm.org/D80484
With this change it is be possible to write FileCheck expressions such
as [[#(VAR+1)-2]]. Currently, the only supported arithmetic operators are
plus and minus, so this is not particularly useful yet. However, it our
CHERI fork we have tests that benefit from having multiplication in
FileCheck expressions. Allowing parenthesized expressions is the simplest
way for us to work around the current lack of operator precedence in
FileCheck expressions.
Reviewed By: thopre, jhenderson
Differential Revision: https://reviews.llvm.org/D77383
Add ThreadClock:: global_acquire_ which is the last time another thread
has done a global acquire of this thread's clock.
It helps to avoid problem described in:
https://github.com/golang/go/issues/39186
See test/tsan/java_finalizer2.cpp for a regression test.
Note the failuire is _extremely_ hard to hit, so if you are trying
to reproduce it, you may want to run something like:
$ go get golang.org/x/tools/cmd/stress
$ stress -p=64 ./a.out
The crux of the problem is roughly as follows.
A number of O(1) optimizations in the clocks algorithm assume proper
transitive cumulative propagation of clock values. The AcquireGlobal
operation may produce an inconsistent non-linearazable view of
thread clocks. Namely, it may acquire a later value from a thread
with a higher ID, but fail to acquire an earlier value from a thread
with a lower ID. If a thread that executed AcquireGlobal then releases
to a sync clock, it will spoil the sync clock with the inconsistent
values. If another thread later releases to the sync clock, the optimized
algorithm may break.
The exact sequence of events that leads to the failure.
- thread 1 executes AcquireGlobal
- thread 1 acquires value 1 for thread 2
- thread 2 increments clock to 2
- thread 2 releases to sync object 1
- thread 3 at time 1
- thread 3 acquires from sync object 1
- thread 1 acquires value 1 for thread 3
- thread 1 releases to sync object 2
- sync object 2 clock has 1 for thread 2 and 1 for thread 3
- thread 3 releases to sync object 2
- thread 3 sees value 1 in the clock for itself
and decides that it has already released to the clock
and did not acquire anything from other threads after that
(the last_acquire_ check in release operation)
- thread 3 does not update the value for thread 2 in the clock from 1 to 2
- thread 4 acquires from sync object 2
- thread 4 detects a false race with thread 2
as it should have been synchronized with thread 2 up to time 2,
but because of the broken clock it is now synchronized only up to time 1
The global_acquire_ value helps to prevent this scenario.
Namely, thread 3 will not trust any own clock values up to global_acquire_
for the purposes of the last_acquire_ optimization.
Reviewed-in: https://reviews.llvm.org/D80474
Reported-by: nvanbenschoten (Nathan VanBenschoten)
Now that OpBuilder is available in `build` functions, it becomes possible to
populate the "then" and "else" regions directly when building the "if"
operation. This is desirable in more structured forms of builders, especially
in when conditionals are mixed with loops. Provide new `build` APIs taking
callbacks for body constructors, similarly to scf::ForOp, and replace more
clunky edsc::BlockBuilder uses with these. The original APIs remain available
and go through the new implementation.
Differential Revision: https://reviews.llvm.org/D80527
Some testcases are unexpectedly passing with NPM.
This is because the target functions are inlined in NPM.
I think we should add noinline attribute to keep these test points.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D79648
When the p_offset/p_filesz of the PT_GNU_EH_FRAME is invalid
(e.g larger than the file size) then llvm-readobj might crash.
This patch fixes the issue. I've introduced `ELFFile<ELFT>::getSegmentContent`
method, which is very similar to `ELFFile<ELFT>::getSectionContentsAsArray` one.
Differential revision: https://reviews.llvm.org/D80380
This makes sure to correctly register the loop info of the children
of unroll and jammed loops. It re-uses some code from the unroller for
registering subloops.
Differential Revision: https://reviews.llvm.org/D80619
The vector equivalent has backwards operands, but the scalar version
does not. The passes that use these hooks aren't enabled by default,
so this doesn't really change anything.
Summary:
In Thumb2's frame index rewriting process, the address mode i8s4, which
is used by LDRD and STRD instructions, is handled by taking the
immediate offset operand and multiplying it by 4.
This behaviour is wrong, however. In this specific address mode, the
MachineInstr's immediate operand is already in the expected form. By
consequence of that, multiplying it once more by 4 yields a flawed
offset value, four times greater than it should be.
Differential Revision: https://reviews.llvm.org/D80557
Summary: `CommandObject::CheckRequirements` requires cleaning up `m_exe_ctx`
between commands. Function `HandleOptionCompletion` returns without cleaning up
`m_exe_ctx` could cause assert failure in later `CheckRequirements`.
Reviewers: teemperor, JDevlieghere
Reviewed By: teemperor
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D80447
Summary:
Updating IR now that alignment is explicitly set.
This is a prerequisite to D80276.
Reviewers: efriedma
Subscribers: llvm-commits, craig.topper
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80549
If it turns out that we can do runtime checks, but there are no
runtime-checks to generate, set RtCheck.Need to false.
This can happen if we can prove statically that the pointers passed in
to canCheckPtrAtRT do not alias. This should not change any results, but
allows us to skip some work and assert that runtime checks are
generated, if LAA indicates that runtime checks are required.
Reviewers: anemet, Ayal
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D79969
Note: This is a recommit of 259abfc7cb,
with some suggested renaming.
If it turns out that we can do runtime checks, but there are no
runtime-checks to generate, set RtCheck.Need to false.
This can happen if we can prove statically that the pointers passed in
to canCheckPtrAtRT do not alias. This should not change any results, but
allows us to skip some work and assert that runtime checks are
generated, if LAA indicates that runtime checks are required.
Reviewers: anemet, Ayal
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D79969
Currently if instructions defined in a block are used in unreachable
blocks and SimpleLoopUnswitch attempts deleting the block, it triggers
assertion "Uses remain when a value is destroyed!".
This patch fixes it by replacing all uses of instructions from BB with
undefs before BB deletion.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D80551
As was mentioned in review comments for D80204,
`printHashHistogram` has 2 lambdas that are probably too large
and deserves splitting into member functions.
This patch does it.
Differential revision: https://reviews.llvm.org/D80546
When the `--elf-hash-histogram` is used, the code first tries to build
a histogram for the .hash table and then for the .gnu.hash table.
The problem is that dumper might return early when unable or do not need to
build a histogram for the .hash.
This patch reorders the code slightly to fix the issue and adds a test case.
Differential revision: https://reviews.llvm.org/D80204
Summary:
For ObjCInterfaceDecls, LLDB iterates over the `methods` of the interface in FindExternalVisibleDeclsByName
since commit ef423a3ba5 .
However, when LLDB calls `oid->methods()` in that function, Clang will pull in all declarations in the current
DeclContext from the current ExternalASTSource (which is again, `ClangExternalASTSourceCallbacks`). The
reason for that is that `methods()` is just a wrapper for `decls()` which is supposed to provide a list of *all*
(both currently loaded and external) decls in the DeclContext.
However, `ClangExternalASTSourceCallbacks::FindExternalLexicalDecls` doesn't implement support for ObjCInterfaceDecl,
so we don't actually add any declarations and just mark the ObjCInterfaceDecl as having no ExternalLexicalStorage.
As LLDB uses the ExternalLexicalStorage to see if it can complete a type with the ExternalASTSource, this causes
that LLDB thinks our class can't be completed any further by the ExternalASTSource
and will from on no longer make any CompleteType/FindExternalLexicalDecls calls to that decl. This essentially
renders those types unusable in the expression parser as they will always be considered incomplete.
This patch just changes the call to `methods` (which is just a `decls()` wrapper), to some ad-hoc `noload_methods`
call which is wrapping `noload_decls()`. `noload_decls()` won't trigger any calls to the ExternalASTSource, so
this prevents that ExternalLexicalStorage will be set to false.
The test for this is just adding a method to an ObjC interface. Before this patch, this unset the ExternalLexicalStorage
flag and put the interface into the state described above.
In a normal user session this situation was triggered by setting a breakpoint in a method of some ObjC class. This
caused LLDB to create the MethodDecl for that specific method and put it into the the ObjCInterfaceDecl.
Also `ObjCLanguageRuntime::LookupInCompleteClassCache` needs to be unable to resolve the type do
an actual definition when the breakpoint is set (I'm not sure how exactly this can happen, but we just
found no Type instance that had the `TypePayloadClang::IsCompleteObjCClass` flag set in its payload in
the situation where this happens. This however doesn't seem to be a regression as logic wasn't changed
from what I can see).
The module-ownership.mm test had to be changed as the only reason why the ObjC interface in that test had
it's ExternalLexicalStorage flag set to false was because of this unintended side effect. What actually happens
in the test is that ExternalLexicalStorage is first set to false in `DWARFASTParserClang::CompleteTypeFromDWARF`
when we try to complete the `SomeClass` interface, but is then the flag is set back to true once we add
the last ivar of `SomeClass` (see `SetMemberOwningModule` in `TypeSystemClang.cpp` which is called
when we add the ivar). I'll fix the code for that in a follow-up patch.
I think some of the code here needs some rethinking. LLDB and Clang shouldn't infer anything about the ExternalASTSource
and its ability to complete the current type form the `ExternalLexicalStorage` flag. We probably should
also actually provide any declarations when we get asked for the lexical decls of an ObjCInterfaceDecl. But both of those
changes are bigger (and most likely would cause us to eagerly complete more types), so those will be follow up patches
and this patch just brings us back to the state before commit ef423a3ba5 .
Fixes rdar://63584164
Reviewers: aprantl, friss, shafik
Reviewed By: aprantl, shafik
Subscribers: arphaman, abidh, JDevlieghere
Differential Revision: https://reviews.llvm.org/D80556
If we are using PTEST to check 'allsign bits' vector elements we can use MOVMSK to extract the signbits directly and perform the comparison on the scalar value.
For vXi16 cases, as we don't have a MOVMSK for this type, we must mask each signbit out of a PMOVMSKB v2Xi8 result, which folds into the TEST comparison.
If this allows us to remove a vector op (via the SimplifyMultipleUseDemandedBits call) this is consistently faster than a PTEST (https://godbolt.org/z/ziJUst).
I'm investigating whether we ever get regressions without the SimplifyMultipleUseDemandedBits call, even if this means we don't remove a vector op, but that has exposed some other poor codegen issues that I'm still investigating and would have to wait for a later patch.
Suggested on PR42035 to avoid unnecessary ashr(x,bw-1)/pcmpgt(0,x) sign splat patterns feeding into ptest.
Differential Revision: https://reviews.llvm.org/D80563
Summary:
Previously, we only added early-clobber flags to the 'group' immediate flag operand
of an inline asm operand.
However, we also have to add the EarlyClobber flag to the MachineOperand itself.
This fixes PR46028
Reviewers: arsenm, leonardchan
Reviewed By: arsenm, leonardchan
Subscribers: phosek, wdng, rovka, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80467