Adjust `ShouldAutoContinue` to be available to any thread plan previous to the plan that
explains a stop, not limited to the parent to the plan that explains the stop.
Before this change, `Thread::ShouldStop` did the following:
1. find the plan that explains the stop
2. if it's not a master plan, continue processing previous (aka parent) plans
3. first, call `ShouldAutoContinue` on the immediate parent of the explaining plan
4. then loop over previous plans, calling `ShouldStop` and `MischiefManaged`
Of note, the iteration in step 4 does not call `ShouldAutoContinue`, so again only the
plan just prior to the explaining plan is given the opportunity to override whether to
continue or stop.
This commit changes the loop call `ShouldAutoContinue`, giving each plan the opportunity
to override `ShouldStop` of previous plans.
Why? This allows a plan to do the following:
1. mark itself done and be popped off the stack
2. allow parent plans to finish their work, and to also be popped off the stack
3. and finally, have the thread continue, not stop
This is useful for stepping into async functions. A plan will would step far enough
enough to set a breakpoint on the async target, and then use `ShouldAutoContinue` to
unwind the necessary stepping, and then have the calling thread continue.
Differential Revision: https://reviews.llvm.org/D97076
This makes the symlinks work properly on windows.
A similar round of cleanup was done in
c41bda7f5f, but these tests were
added after that.
Differential Revision: https://reviews.llvm.org/D97089
The spec doesn't declare it as an enum class, and being declared
as an enum class breaks referring to the values as e.g.
path::auto_format.
Differential Revision: https://reviews.llvm.org/D97084
This can reduce the binary size because counters will no longer occupy
space in the binary, instead they will be allocated by dynamic linker.
Differential Revision: https://reviews.llvm.org/D97110
Diagnose the problem in templates in the context of the template
declaration instead of in the context of all of the (possibly very many)
template instantiations.
Differential Revision: https://reviews.llvm.org/D96224
When one of the inputs is a wrapping range, intersect with the
union of the two inputs. The union of the two inputs corresponds
to the result we would get if we treated the min/max as a simple
select.
This fixes PR48643.
Follow-up to:
D96648 / b40fde062
...for the special-case base calls.
From the earlier commit:
This is unusual in the general (non-reciprocal) case because we need
an extra instruction, but that should be better for general FP
reassociation and codegen. We conservatively check for "arcp" FMF
here as we do with existing fdiv folds, but it is not strictly
necessary to have that.
We don't need any special handling for wrapping ranges (or empty
ranges for that matter). The sub() call will already compute a
correct and precise range.
We only need to adjust the test expectation: We're now computing
an optimal result, rather than an unsigned envelope.
This adds the IR for this C code
int32_t foo(uint16_t x, int16_t y) {
x %= y;
return x;
}
Note the dividend is unsigned and the divisor is signed. C type
promotion rules will extend them and use a 32-bit srem and the
function returns a 32-bit result.
We fail to use remw for this case. The zero extended input has
enough sign bits, but we won't consider (i64 AssertZext X, i16) in
the sexti32 isel pattern.
We also end up with a extra shifts to zero upper bits on the result.
computeKnownBits knew the result was positive before type legalization
and allowed the SIGN_EXTEND to become ZERO_EXTEND. But after promoting
to i64 we no longer know that bit 31 (and all bits above it) should
be 0.
`sm_35` is the minimum requirement for OpenMP offloading on NVPTX device.
Current driver test case is using `sm_20`. D97003 is going to switch the minimum
CUDA version to 9.2, which only supports `sm_30+`. This patch makes step for the
change.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D97120
When compiling UEFI applications, the main function is named
efi_main() instead of main(). Let's exclude efi_main() from
-Wmissing-prototypes as well to avoid warnings when working
on UEFI applications.
Differential Revision: https://reviews.llvm.org/D95746
When the optimality check fails, print the inputs, the computed
range and the better range that was found. This makes it much
simpler to identify the cause of the failure.
Make sure that full ranges (which, unlikely all the other cases,
have multiple ways to construct them that all result in the same
range) only print one message by handling them separately.
See discussion on https://reviews.llvm.org/D93263
-flat_namespace isn't implemented yet, and neither is -undefined dynamic,
so this makes -undefined pretty pointless in lld/MachO for now. But once
we implement -flat_namespace (which we need to do anyways to get check-llvm
to pass with lld as host linker), the code's already there.
Follow-up to https://reviews.llvm.org/D93263#2491865
Differential Revision: https://reviews.llvm.org/D96963
Refines the fix in 3c4c205060 to only
put globals whose defs were cloned into the split regular LTO module
on the cloned llvm*.used globals. This avoids an issue where one of the
attached values was a local that was promoted in the original module
after the module was cloned. We only need to have the values defined in
the new module on those globals.
Fixes PR49251.
Differential Revision: https://reviews.llvm.org/D97013
This reverts commit 9148302a (2019-08-22) which broke the pre-existing
unit test for the matcher. Also revert commit 518b2266 (Fix the
nullPointerConstant() test to get bots back to green., 2019-08-22) which
incorrectly changed the test to expect the broken behavior.
Differential Revision: https://reviews.llvm.org/D96665
This patch extends the support for RVV EXTRACT_SUBVECTOR to cover those
which don't align to a vector register boundary. It accomplishes this by
extracting the nearest register-sized subvector (a subregister
operation), then sliding the vector down with VSLIDEDOWN and extracting
the subvector from the first position (a COPY operation).
Since this procedure involves the use of VSCALE and multiplication, the
handling of such operations is done during lowering to simplify the
implementation and make use of DAG combining. This necessitated moving
some helper functions from RISCVISelDAGToDAG to RISCVTargetLowering.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D96959
With vector mask registers only allocatable to V0 (VMV0Regs) it is
relatively simple to generate code which uses multiple masks and naively
requires spilling.
This patch aims to improve codegen in such cases by telling LLVM it can
use VRRegs to hold masks. This will prevent spilling in many cases by
having LLVM copy to an available VR register.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D97055
recognizeBSwapOrBitReverseIdiom + collectBitParts have pattern matching to bail out early if a bswap/bitreverse pattern isn't possible - we should be able to rely on this instead without any notable change in compile time.
This is part of a cleanup towards letting matchBSwapOrBitReverse /recognizeBSwapOrBitReverseIdiom use 'root' instructions that aren't ORs (FSHL/FSHRs in particular which can be prematurely created).
Differential Revision: https://reviews.llvm.org/D97056
The current infrastructure for exhaustive ConstantRange testing is
somewhat confusing in what exactly it tests and currently cannot even
be used for operations that produce smallest-size results, rather than
signed/unsigned envelopes.
This patch makes the testing more principled by collecting the exact
set of results of an operation into a bit set and then comparing it
against the range approximation by:
* Checking conservative correctness: All elements in the set must be
in the range.
* Checking optimality under a given preference function: None of the
(slack-free) ranges that can be constructed from the set are
preferred over the computed range.
Implemented preference functions are:
* PreferSmallest: Smallest range regardless of signed/unsigned wrapping
behavior. Probably what we would call "optimal" without further
qualification.
* PreferSmallestUnsigned/Signed: Smallest range that has no
unsigned/signed wrapping. We use this if our calculation is precise
only up to signed/unsigned envelope.
* PreferSmallestNonFullUnsigned/Signed: Smallest range that has no
unsigned/signed wrapping -- but preferring a smaller wrapping range
over a (non-wrapping) full range. We use this if we have a fully
precise calculation but apply a sign preference to the result
(union/intersection). Even with a sign preference, returning a
wrapping range is still "strictly better" than returning a full one.
This also addresses PR49273 by replacing the fragile manual range
construction logic in testBinarySetOperationExhaustive() with generic
code that isn't specialized to the particular form of ranges that set
operations can produces.
Differential Revision: https://reviews.llvm.org/D88356
In semi-automated environments, XFAILing or filtering out known regressions without actually committing changes or temporarily modifying the test suite can be quite useful.
Reviewed By: yln
Differential Revision: https://reviews.llvm.org/D96662
This is a minor pattern-match update to BPFAdjustOpt.cpp to accept
not only 'or i1 a, b' but also 'select i1 a, i1 true, i1 b'.
This resolves regression after SimplifyCFG's creating select form
of and/or instead (https://reviews.llvm.org/D95026).
This is a small change, and currently such select form isn't created
or doesn't reach to the late pipeline (because InstCombine eagerly
folds it into and/or i1), so I chose to commit without a review process.
These don't seem to have any function in the test.
The non_regular_file one seems to have been added in
0f8c8f59df, without any apparent
purpose there.
Differential Revision: https://reviews.llvm.org/D97083