Currently this code exists in widenScalar for G_MERGE_VALUE
sources. I'm not sure if the existing expansion in widenScalar should
be removed or not. The widenScalar variant tries to extend to the
requested size, but this just uses the original bitwidth.
G_BITCAST can be lowered with a pair of G_UNMERGE_VALUES and
G_MERGE_VALUES with different types, but G_UNMERGE_VALUES of a vector
can also be implemented with a bitcast to a scalar, which introduces
the possibility of infinite loops. Try to eliminate an illegal source
register type in the artifact combiner to prevent this from happening.
Avoids infinite looping in the legalizer in a future patch which
allows lowering G_UNMERGE_VALUES of a vector source with a G_BITCAST.
Consider any constant memory type, not just global constants. AMDGPU
kernel parameters are effectively global constants, but appear as
either reads from an intrinsic derived pointer or function argument.
Check the address space first before searching for the object
definition to save compile time. As an added bonus, this will now
treat casts to constant addrspace as constant.
We also seemed to be missing targeted tests for this, so add a few
other missing cases too.
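The check ordering described above can be sketched as follows. This is a simplified stand-alone model, not the actual LLVM code: `Pointer`, `Global`, `findUnderlyingObject`, `pointsToConstantMemory`, and the address-space number 4 are all assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical stand-ins for IR objects; these are not LLVM types.
struct Global {
  bool isConstant;
};

struct Pointer {
  unsigned addrSpace; // address space of the pointer type
  const Global *base; // underlying object, or nullptr if unknown
};

// Assumed number of the constant address space, for illustration only.
constexpr unsigned kConstantAddrSpace = 4;

// Counts how often the comparatively expensive definition search runs.
static int lookups = 0;

static const Global *findUnderlyingObject(const Pointer &p) {
  // The caller is expected to have handled constant-address-space
  // pointers already, so the search is never paid for them.
  assert(p.addrSpace != kConstantAddrSpace && "search should be skipped");
  ++lookups;
  return p.base;
}

bool pointsToConstantMemory(const Pointer &p) {
  // Cheap check first: any pointer in the constant address space
  // qualifies, including kernel arguments and casts into that space.
  if (p.addrSpace == kConstantAddrSpace)
    return true;
  // Only then search for the defining object.
  const Global *g = findUnderlyingObject(p);
  return g && g->isConstant;
}
```

A pointer with no known definition (such as a kernel argument) is still classified as constant purely from its address space, and `lookups` stays at zero for it.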
Helps improve test coverage of the XOP modes in X86TargetLowering::isVectorShiftByScalarCheap (and where we always return false for vXi8 vector shifts).
This enables the register to be changed from XMM/YMM/ZMM0 to
instead match the other source. This prevents a false
dependency.
I added all the integer unpck instructions, but the tests
only show changes for BW and WD.
Unfortunately, we can have undef on operand 1 or 2 of the AVX
instructions. This breaks the interface with hasUndefRegUpdate
which used to tell which operand to check.
Now we scan the input operands looking for an undef register and
then ask hasUndefRegUpdate if it's an instruction we care about
and which operands of that instruction we care about.
I also had to make some changes to the load folding code to
always pass operand 1 to hasUndefRegUpdate. I've updated
hasUndefRegUpdate to return false when ForLoadFold is set for
instructions that are not explicitly blocked for load folding in
isel patterns.
Differential Revision: https://reviews.llvm.org/D79615
Unlike Neon, MVE does not have a way of duplicating from a vector lane,
so a VDUPLANE currently selects to a VDUP(move_from_lane(..)). This
forces that to be done earlier as a dag combine to allow other folds to
happen.
It converts to a VDUP(EXTRACT). On FP16 this is then folded to a
VGETLANEu to prevent it from creating a vmovx;vmovhr pair, using a
single move_from_reg instead.
Differential Revision: https://reviews.llvm.org/D79606
MachProcess.mm uses a TARGET_OS_ macro without directly including
TargetConditionals.h. This currently works as we get the header
as an indirect dependency, but might not in the future.
I just spent some time investigating an internal regression
caused by a similar issue, so I audited the codebase for such
cases.
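A minimal sketch of why the direct include matters. `isEmbeddedTarget` is a hypothetical helper, not code from MachProcess.mm. The key point is that an undefined identifier in an `#if` silently evaluates to 0, so a missing include produces no diagnostic by default, only the wrong branch.

```cpp
// Include the header that defines the TARGET_OS_* macros directly rather
// than relying on some other header to pull it in.
#if defined(__APPLE__)
#include <TargetConditionals.h>
#endif

// Hypothetical helper, for illustration only.
constexpr bool isEmbeddedTarget() {
#if TARGET_OS_IPHONE
  return true;
#else
  // This branch is also taken when TARGET_OS_IPHONE is not defined at
  // all: the preprocessor treats unknown identifiers in #if as 0.
  return false;
#endif
}
```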
When intrinsic types are assigned there are some implicit conversions
that take place. This change makes them explicit in the type
representation of assignments.
Differential Revision: https://reviews.llvm.org/D79637
Summary:
Sometimes in templated code, member references are reported as `DependentScopeMemberExpr` because that's what the standard dictates. However, in many trivial cases it is easy to resolve the reference to its actual member.
Take this code:
```
template<typename T>
class A{
  int value;
  A& operator=(const A& Other){
    value = Other.value;
    this->value = Other.value;
    return *this;
  }
};
```
When run with `clang-tidy file.cpp -checks=readability-identifier-naming --config="{CheckOptions: [{key: readability-identifier-naming.MemberPrefix, value: m_}]}" -fix`
Current behaviour:
```
template<typename T>
class A{
  int m_value;
  A& operator=(const A& Other){
    m_value = Other.value;
    this->value = Other.value;
    return *this;
  }
};
```
Because `this->value` and `Other.value` are dependent, they are ignored when creating the fix-its; however, this can easily be resolved.
Proposed behaviour:
```
template<typename T>
class A{
  int m_value;
  A& operator=(const A& Other){
    m_value = Other.m_value;
    this->m_value = Other.m_value;
    return *this;
  }
};
```
Reviewers: aaron.ballman, JonasToth, alexfh, hokein, gribozavr2
Reviewed By: aaron.ballman
Subscribers: merge_guards_bot, xazax.hun, cfe-commits
Tags: #clang, #clang-tools-extra
Differential Revision: https://reviews.llvm.org/D73052
The challenge with measuring time in tests is that slow and/or busy
machines can cause tests to fail in unexpected ways. After this change,
three tests should be much more robust. The only remaining and tiny race
that I can think of is preemption after `--countDown`. That being said,
the race isn't fixable because the standard library doesn't provide a
way to count threads that are waiting to acquire a lock.
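A minimal sketch of the countdown pattern described above, not the actual test code: `worker`, `runScenario`, and `acquired` are hypothetical names. The main thread holds the mutex until every worker has decremented `countDown`, so no timing measurement is needed. The comment in `worker` marks the remaining preemption window.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

std::mutex m;
std::atomic<int> countDown{0};
std::atomic<int> acquired{0};

void worker() {
  --countDown; // announce "about to block on the mutex"
  // Remaining race: this thread can be preempted right here, before it
  // actually starts waiting on the lock. There is no portable way to
  // ask the mutex how many threads are waiting on it.
  std::lock_guard<std::mutex> lock(m);
  ++acquired;
}

// Returns the number of workers that eventually acquired the lock.
int runScenario(int numThreads) {
  countDown = numThreads;
  acquired = 0;
  m.lock();
  std::vector<std::thread> threads;
  for (int i = 0; i < numThreads; ++i)
    threads.emplace_back(worker);
  // Wait for a state change instead of sleeping for a fixed time.
  while (countDown.load() > 0)
    std::this_thread::yield();
  // The main thread still owns the mutex, so no worker can be inside.
  assert(acquired.load() == 0);
  m.unlock();
  for (std::thread &t : threads)
    t.join();
  return acquired.load();
}
```

Because the main thread waits on `countDown` rather than on a clock, a slow or busy machine only makes the scenario take longer; it does not change the outcome.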
Reviewers: ldionne, EricWF, howard.hinnant, mclow.lists, #libc
Reviewed By: ldionne, #libc
Subscribers: dexonsmith, jfb, broadwaylamb, libcxx-commits
Tags: #libc
Differential Revision: https://reviews.llvm.org/D79406
For the sint_to_fp(and(X,C)) -> and(X,sint_to_fp(C)) fold, allow combineVectorCompareAndMaskUnaryOp to match any X that ComputeNumSignBits says is all-bits, not just SETCC.
Noticed while investigating mask promotion issues in PR45808
For the sint_to_fp(and(X,C)) -> and(X,sint_to_fp(C)) fold, combineVectorCompareAndMaskUnaryOp only matches X against SETCC (with an all-bits result) when really it could accept anything that ComputeNumSignBits says is all-bits.
Noticed while investigating mask promotion issues in PR45808
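A scalar sketch of why the fold is valid only when X is all-bits (every lane is either all ones or all zeros). This models a single 32-bit lane in plain C++ and is not the DAG combine itself; `bitsOf`, `unfolded`, `folded`, and `checkAllBitsLanes` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret a float as its 32-bit pattern.
static uint32_t bitsOf(float f) {
  uint32_t u;
  std::memcpy(&u, &f, sizeof u);
  return u;
}

// sint_to_fp(and(X, C)): mask first, then convert.
static uint32_t unfolded(int32_t x, int32_t c) {
  return bitsOf(static_cast<float>(x & c));
}

// and(X, sint_to_fp(C)): convert the constant, then mask its bits.
static uint32_t folded(int32_t x, int32_t c) {
  return static_cast<uint32_t>(x) & bitsOf(static_cast<float>(c));
}

// With X == 0 both forms give +0.0; with X == -1 both give (float)C.
static void checkAllBitsLanes() {
  const int32_t constants[] = {1, 42, -7, 1 << 20};
  for (int32_t c : constants) {
    assert(folded(0, c) == unfolded(0, c));
    assert(folded(-1, c) == unfolded(-1, c));
  }
}
```

For a lane that is not all-bits, such as X == 1 with C == 3, the two forms differ, which is why the combine has to establish the all-bits property first.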
D55859 and D63339 prevented needless dependencies on system symbol
files. This testcase was checked-in afterwards and it brings back one
such unwanted dependency. Under some circumstances it may cause false
FAILs and/or excessive resource usage to run the testcase.
clang-format does not support .py, so I have formatted it in the way I
found most compatible.
Also, this is not a full testcase-style initialization. For example,
--no-lldbinit ignores the env("NO_LLDBINIT") setting, which lldbtest.py
does implement:
    # If we spawn an lldb process for test (via pexpect), do not load the
    # init file unless told otherwise.
    if os.environ.get("NO_LLDBINIT") != "NO":
        self.lldbOption += " --no-lldbinit"
But this is what lldbpexpect.py does - it also ignores
env("NO_LLDBINIT"). One could also fix lldbpexpect.py to unify the
initialization with lldbtest.py, but I find that outside the
scope of this patch.
Differential Revision: https://reviews.llvm.org/D79649
It looks like that was an initial intention, but some code paths in
`DWARFExpression::Operation::extract()` did not initialize `EndOffset`
properly.
Differential Revision: https://reviews.llvm.org/D79622
Previously we implemented non-standard disambiguation rules to
distinguish an enum-base from a bit-field but otherwise treated a ':'
after an elaborated-enum-specifier as introducing an enum-base. That
misparses various examples (anywhere an elaborated-type-specifier can
appear followed by a colon, such as within a ternary operator or
_Generic).
We now implement the C++11 rules, with the old cases accepted as
extensions where that seemed reasonable. These amount to:
* an enum-base must always be accompanied by an enum definition (except
in a standalone declaration of the form 'enum E : T;')
* in a member-declaration, 'enum E :' always introduces an enum-base,
never a bit-field
* in a type-specifier (or similar context), 'enum E :' is not
permitted; the colon means whatever else it would mean in that
context.
Fixed underlying types for enums are also permitted in Objective-C and
under MS extensions, plus as a language extension in all other modes.
The behavior in ObjC and MS extensions modes is unchanged (but the
bit-field disambiguation is a bit better); remaining language modes
follow the C++11 rules.
Fixes PR45726, PR39979, PR19810, PR44941, and most of PR24297, plus C++
core issues 1514 and 1966.
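A small illustration of the first two rules listed above, using only forms that are valid standard C++11. The names `E`, `S`, and `F` are arbitrary. The type-specifier case (for example inside a ternary) is deliberately left out of this sketch.

```cpp
#include <type_traits>

// Standalone opaque declaration: the one form in which an enum-base may
// appear without an enum definition.
enum E : int;

// Definition accompanied by an enum-base.
enum E : int { First, Second };

struct S {
  // In a member-declaration, the colon after 'enum F' introduces an
  // enum-base, not a bit-field width.
  enum F : unsigned char { X, Y };
  F f;
};

static_assert(std::is_same<std::underlying_type<E>::type, int>::value,
              "E has underlying type int");
static_assert(
    std::is_same<std::underlying_type<S::F>::type, unsigned char>::value,
    "S::F has underlying type unsigned char");
```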
Linkage type was only referenced for functions, not for global
variables.
Clarify that LLVM doesn't make any assumption about the allocation size when
no definitive initializer for a global variable is known.
Differential Revision: https://reviews.llvm.org/D78952
This patch stores the alignment for ConstantPoolSDNode as an
Align and updates the getConstantPool interface to take a MaybeAlign.
Removing getAlignment() will be done as a follow up.
Differential Revision: https://reviews.llvm.org/D79436
This is mostly useful if the alloca element type is not an integer
and is then cast to an integer for a load or store. We now can
vectorize an [i32] alloca but cannot do so for [float].
A separate patch is also needed to properly lower 64-bit
types after they are vectorized. At the moment these are lowered
via scratch anyway.
Differential Revision: https://reviews.llvm.org/D79641
Separate functions that require shared state into a class to avoid
needing to pass them through multiple functions just to be available
where needed.
The main motivation for this is that we would like to remove the
limitation that accumulator values be dynamic constants, which would
require additional shared state between call eliminations in the same
function, compounding this issue.
Differential Revision: https://reviews.llvm.org/D79299
I noticed that std::error_code() does one-time initialization. Avoid
that overhead with Expected<T> and llvm::Error. Also, it is consistent
with the virtual interface and ELF, and generally cleaner.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D79643
Summary:
Since the underlying wait and notify instructions are only available
when the atomics feature is enabled, it only makes sense to expose
their builtin functions when atomics are enabled.
Reviewers: aheejin, sunfish
Subscribers: dschuff, sbc100, jgravelle-google, jfb, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79534
Summary:
The WebAssembly backend automatically lowers atomic operations and TLS
to nonatomic operations and non-TLS data when either is present and
the atomics or bulk-memory features are not present, respectively. The
resulting object is no longer thread-safe, so the linker has to be
told not to allow it to be linked into a module with shared
memory. This was previously done by disallowing the 'atomics' feature,
which prevented any object with its atomic operations or TLS removed
from being linked with any object containing atomics or TLS, and
therefore preventing it from being linked into a module with shared
memory since shared memory requires atomics.
However, as of https://github.com/WebAssembly/threads/issues/144, the
validation rules are relaxed to allow atomic operations to validate
with unshared memories, which makes it perfectly safe to link an
object with stripped atomics and TLS with another object that still
contains TLS and atomics as long as the resulting module has an
unshared memory. To allow this kind of link, this patch disallows a
pseudo-feature 'shared-mem' rather than 'atomics' to communicate to
the linker that the object is not thread-safe. This means that the
'atomics' feature is available to accurately reflect whether or not an
object has atomics enabled.
As a drive-by tweak, this change also requires that bulk-memory be
enabled in addition to atomics in order to use shared memory. This is
because initializing shared memories requires bulk-memory operations.
Reviewers: aheejin, sbc100
Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79542
The code to prevent using `PPCXCOFFMCAsmInfo` with little-endian targets
used an incorrect check. Also, there does not appear to be sufficient
earlier checking to prevent failing this check, so the check here is
upgraded to be a `report_fatal_error`.
`PPCAIXAsmPrinter` was also missing a check against use with
little-endian targets. This patch adds such a check in.