The challenge with measuring time in tests is that slow and/or busy
machines can cause tests to fail in unexpected ways. After this change,
three tests should be much more robust. The only remaining and tiny race
that I can think of is preemption after `--countDown`. That being said,
the race isn't fixable because the standard library doesn't provide a
way to count threads that are waiting to acquire a lock.
Reviewers: ldionne, EricWF, howard.hinnant, mclow.lists, #libc
Reviewed By: ldionne, #libc
Subscribers: dexonsmith, jfb, broadwaylamb, libcxx-commits
Tags: #libc
Differential Revision: https://reviews.llvm.org/D79406
For the sint_to_fp(and(X,C)) -> and(X,sint_to_fp(C)) fold, allow combineVectorCompareAndMaskUnaryOp to match any X that ComputeNumSignBits says is all-bits, not just SETCC.
Noticed while investigating mask promotion issues in PR45808
For the sint_to_fp(and(X,C)) -> and(X,sint_to_fp(C)) fold, combineVectorCompareAndMaskUnaryOp only matches X against SETCC (with an all-bits result) when really it could accept anything that ComputeNumSignBits says is all-bits.
Noticed while investigating mask promotion issues in PR45808
D55859 and D63339 prevented needless dependencies on system symbol
files. This testcase was checked-in afterwards and it brings back one
such unwanted dependency. Under some circumstances it may cause false
FAILs and/or excessive resource usage to run the testcase.
clang-format does not support .py so I have formatted it as I found most
compatible.
Also this is not a full testcase-style initialization, for example
--no-lldbinit ignores env("NO_LLDBINIT") setting which lldbtest.py does
implement:
# If we spawn an lldb process for test (via pexpect), do not load the
# init file unless told otherwise.
if os.environ.get("NO_LLDBINIT") != "NO":
self.lldbOption += " --no-lldbinit"
But this is what lldbpexpect.py does - it also ignores
env("NO_LLDBINIT"). Sure one could also fix lldbpexpect.py to unify the
initialization more with lldbtest.py but I find that outside of the
scope of this patch.
Differential Revision: https://reviews.llvm.org/D79649
It looks like that was an initial intention, but some code paths in
`DWARFExpression::Operation::extract()` did not initialize `EndOffset`
properly.
Differential Revision: https://reviews.llvm.org/D79622
Previously we implemented non-standard disambiguation rules to
distinguish an enum-base from a bit-field but otherwise treated a :
after an elaborated-enum-specifier as introducing an enum-base. That
misparses various examples (anywhere an elaborated-type-specifier can
appear followed by a colon, such as within a ternary operator or
_Generic).
We now implement the C++11 rules, with the old cases accepted as
extensions where that seemed reasonable. These amount to:
* an enum-base must always be accompanied by an enum definition (except
in a standalone declaration of the form 'enum E : T;')
* in a member-declaration, 'enum E :' always introduces an enum-base,
never a bit-field
* in a type-specifier (or similar context), 'enum E :' is not
permitted; the colon means whatever else it would mean in that
context.
Fixed underlying types for enums are also permitted in Objective-C and
under MS extensions, plus as a language extension in all other modes.
The behavior in ObjC and MS extensions modes is unchanged (but the
bit-field disambiguation is a bit better); remaining language modes
follow the C++11 rules.
Fixes PR45726, PR39979, PR19810, PR44941, and most of PR24297, plus C++
core issues 1514 and 1966.
Linkage type was only referenced for functions, not for global
variables.
Clarify that LLVM doesn't make assumption about the allocation size when
no definitive initializer for a global variable is known.
Differential Revision: https://reviews.llvm.org/D78952
This patch stores the alignment for ConstantPoolSDNode as an
Align and updates the getConstantPool interface to take a MaybeAlign.
Removing getAlignment() will be done as a follow up.
Differential Revision: https://reviews.llvm.org/D79436
This is mostly useful if alloca element type is not integer
and then casted to an integer for load or store. We now can
vectorize an [i32] alloca but cannot do so for [float].
There also a separate patch needed to properly lower 64 bit
types after they vectorized. At the moment these are lowered
via scratch anyway.
Differential Revision: https://reviews.llvm.org/D79641
Separate functions that require shared state into a class to avoid
needing to pass them though multiple functions just to be available
where needed.
The main motivation for this is that we would like to remove the
limitation that accumulator values be dynamic constant, which would
require additional shared state between call eliminations in the same
function, compounding this issue.
Differential Revision: https://reviews.llvm.org/D79299
I noticed that std::error_code() does one-time initialization. Avoid
that overhead with Expected<T> and llvm::Error. Also, it is consistent
with the virtual interface and ELF, and generally cleaner.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D79643
Summary:
Since the underlying wait and notify instructions are only available
when the atomics feature is enabled, it only makes sense to expose
their builtin functions when atomics are enabled.
Reviewers: aheejin, sunfish
Subscribers: dschuff, sbc100, jgravelle-google, jfb, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79534
Summary:
The WebAssembly backend automatically lowers atomic operations and TLS
to nonatomic operations and non-TLS data when either are present and
the atomics or bulk-memory features are not present, respectively. The
resulting object is no longer thread-safe, so the linker has to be
told not to allow it to be linked into a module with shared
memory. This was previously done by disallowing the 'atomics' feature,
which prevented any objct with its atomic operations or TLS removed
from being linked with any object containing atomics or TLS, and
therefore preventing it from being linked into a module with shared
memory since shared memory requires atomics.
However, as of https://github.com/WebAssembly/threads/issues/144, the
validation rules are relaxed to allow atomic operations to validate
with unshared memories, which makes it perfectly safe to link an
object with stripped atomics and TLS with another object that still
contains TLS and atomics as long as the resulting module has an
unshared memory. To allow this kind of link, this patch disallows a
pseudo-feature 'shared-mem' rather than 'atomics' to communicate to
the linker that the object is not thread-safe. This means that the
'atomics' feature is available to accurately reflect whether or not an
object has atomics enabled.
As a drive-by tweak, this change also requires that bulk-memory be
enabled in addition to atomics in order to use shared memory. This is
because initializing shared memories requires bulk-memory operations.
Reviewers: aheejin, sbc100
Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79542
The code to prevent using `PPCXCOFFMCAsmInfo` with little-endian targets
used an incorrect check. Also, there does not appear to be sufficient
earlier checking to prevent failing this check, so the check here is
upgraded to be a `report_fatal_error`.
`PPCAIXAsmPrinter` was also missing a check against use with
little-endian targets. This patch adds such a check in.
Summary:
This patch introduces the constants defined to identify DWARF sections
in XCOFF into `llvm/BinaryFormat/XCOFF.h` and adds (NFC) placeholder
code to `llvm/lib/MC/MCObjectFileInfo.cpp` where the DWARF sections for
XCOFF are to be set up.
Reviewers: jasonliu, sfertile, daltenty, DiggerLin, Xiangling_L
Reviewed By: jasonliu, sfertile, DiggerLin
Differential Revision: https://reviews.llvm.org/D79220
Summary:
`AsmPrinter::emitGlobalIndirectSymbol` is dependent on
`MCStreamer::emitAssignment` to produce `.set` directives for alias
symbols; however, the `.set` pseudo-op on AIX is documented as not
usable with external relocatable terms or expressions, which limits its
applicability in generating alias symbols.
Disable generating aliases on AIX until a different implementation
strategy is available.
Reviewers: cebowleratibm, jasonliu, sfertile, daltenty, DiggerLin
Reviewed By: jasonliu
Differential Revision: https://reviews.llvm.org/D79044
This fixes a verifier failure on a bot:
http://green.lab.llvm.org/green/job/test-suite-verify-machineinstrs-aarch64-O0-g/
```
*** Bad machine code: MBB has duplicate entries in its successor list. ***
- function: foo
- basic block: %bb.5 indirectgoto (0x7fe3d687ca08)
```
One of the GCC torture suite tests (pr70460.c) has an indirectbr instruction
which has duplicate blocks in its destination list.
According to the langref this is allowed:
> Blocks are allowed to occur multiple times in the destination list, though
> this isn’t particularly useful.
(https://www.llvm.org/docs/LangRef.html#indirectbr-instruction)
We don't allow this in MIR. So, when we translate such an instruction, the
verifier screams.
This patch makes `translateIndirectBr` check if a successor has already been
added to a block. If the successor is present, it is skipped rather than added
twice.
Differential Revision: https://reviews.llvm.org/D79609
As with the extractelement patterns that are currently in vector-combine,
there are going to be several possible variations on this theme. This
should be the clearest, simplest example.
Scalarization is the right direction for target-independent canonicalization,
and InstCombine has some of those folds already, but it doesn't do this.
I proposed a similar transform in D50992. Here in vector-combine, we can
check the cost model to be sure it's profitable, so there should be less risk.
Differential Revision: https://reviews.llvm.org/D79452
Because LLDB isn't the one spawning the subprocess, the PID is different
during replay. Exclude it form the substring check during replay.
Depends on D79646 to pass with reproducer replay.
This commit should will break libc++ without local submodule visibility, but
the LLVM+modules bots are now all using this mode. Before the Green Dragon
LLDB bot was failing to compile with a libc++ built with this commit as LSV
was disabled on macOS.
Original summary:
libc++ is careful to not fracture overload sets. When one overload
is visible to a user, all of them should be. Anything less causes
subtle bugs and ODR violations.
Previously, in order to support ::abs and ::div being supplied by
both <cmath> and <cstdlib> we had to do awful things that make
<math.h> and <stdlib.h> have header cycles and be non-modular.
This really breaks with modules.
Specifically the problem was that in C++ ::abs introduces overloads
for floating point numbers, these overloads forward to ::fabs,
which are defined in math.h. Therefore ::abs needed to be in math.h
too. But this required stdlib.h to include math.h and math.h to
include stdlib.h.
To avoid these problems the definitions have been moved to stddef.h
(which math includes), and the floating point overloads of ::abs
have been changed to call __builtin_fabs, which both Clang and GCC
support.
Summary:
If the only use of a value is a start or end lifetime intrinsic then mark the intrinsic as trivially dead. This should allow for that value to then be removed as well.
Currently, this only works for allocas, globals, and arguments.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79355
As suggested in D79116 - there's shared logic between the
existing code and potential new folds. This could go in
ValueTracking if it seems generally useful.
Summary:
`CalculateSyntheticValue` and `GetSyntheticValue` have a `use_synthetic` parameter
that makes the function do nothing when it's false. We obviously always pass true
to the function (or check that the value we pass is true), because there really isn't
any point calling with function with a `false`. This just removes all of this.
Reviewers: labath, JDevlieghere, davide
Reviewed By: davide
Subscribers: davide
Differential Revision: https://reviews.llvm.org/D79568
This does the same thing as {vex2}. Which is give an error
if the instruction can't be done with VEX. It doesn't force
the instruction to use 2 byte VEX. That's already the preference
if its possible. Therefore {vex} is a clearer name.
them in a special text section.
For sampleFDO, because the optimized build uses profile generated from
previous release, previously we couldn't tell a function without profile
was truely cold or just newly created so we had to treat them conservatively
and put them in .text section instead of .text.unlikely. The result was when
we persuing the best performance by locking .text.hot and .text in memory,
we wasted a lot of memory to keep cold functions inside.
In https://reviews.llvm.org/D66374, we introduced profile symbol list to
discriminate functions being cold versus functions being newly added.
This mechanism works quite well for regular use cases in AutoFDO. However,
in some case, we can only have a partial profile when optimizing a target.
The partial profile may be an aggregated profile collected from many targets.
The profile symbol list method used for regular sampleFDO profile is not
applicable to partial profile use case because it may be too large and
introduce many false positives.
To solve the problem for partial profile use case, we provide an option called
--profile-unknown-in-special-section. For functions without profile, we will
still treat them conservatively in compiler optimizations -- for example,
treat them as warm instead of cold in inliner. When we use profile info to
add section prefix for functions, we will discriminate functions known to be
not cold versus functions without profile (being unknown), and we will put
functions being unknown in a special text section called .text.unknown.
Runtime system will have the flexibility to decide where to put the special
section in order to achieve a balance between performance and memory saving.
Differential Revision: https://reviews.llvm.org/D62540