Check the address space first before searching for the object
definition to save compile time. As an added bonus, this will now
treat casts to constant addrspace as constant.
We also seemed to be missing targeted tests for this, so add a few
missing other cases too.
Helps improve test coverage of the XOP modes in X86TargetLowering::isVectorShiftByScalarCheap (and where we always return false for vXi8 vector shifts).
This enables the register to be changed from XMM/YMM/ZMM0 to
instead match the other source. This prevents a false
dependency.
I added all the integer unpck instructions, but the tests
only show changes for BW and WD.
Unfortunately, we can have undef on operand 1 or 2 of the AVX
instructions. This breaks the interface with hasUndefRegUpdate
which used to tell which operand to check.
Now we scan the input operands looking for an undef register and
then ask hasUndefRegUpdate if its an instruction we care about
and which operands of that instruction we care about.
I also had to make some changes to the load folding code to
always pass operand 1 to hasUndefRegUpdate. I've updated
hasUndefRegUpdate to return false when ForLoadFold is set for
instructions that are not explicitly blocked for load folding in
isel patterns.
Differential Revision: https://reviews.llvm.org/D79615
Unlike Neon, MVE does not have a way of duplicating from a vector lane,
so a VDUPLANE currently selects to a VDUP(move_from_lane(..)). This
forces that to be done earlier as a dag combine to allow other folds to
happen.
It converts to a VDUP(EXTRACT). On FP16 this is then folded to a
VGETLANEu to prevent it from creating a vmovx;vmovhr pair, using a
single move_from_reg instead.
Differential Revision: https://reviews.llvm.org/D79606
MachProcess.mm uses a TARGET_OS_ macro without directly including
TargetConditionals.h. This currently works as we get the header
as an indirect dependency, but might not in the future.
I just spent some time investigating an internal regression
caused by a similar issue, so I audited the codebase for such
cases.
When intrinsic types are assigned there are some implicit conversions
that take place. This change make them explicit in the types
representation of assignments.
Differential Revision: https://reviews.llvm.org/D79637
Summary:
Sometimes in templated code Member references are reported as `DependentScopeMemberExpr` because that's what the standard dictates, however in many trivial cases it is easy to resolve the reference to its actual Member.
Take this code:
```
template<typename T>
class A{
int value;
A& operator=(const A& Other){
value = Other.value;
this->value = Other.value;
return *this;
}
};
```
When ran with `clang-tidy file.cpp -checks=readability-identifier-naming --config="{CheckOptions: [{key: readability-identifier-naming.MemberPrefix, value: m_}]}" -fix`
Current behaviour:
```
template<typename T>
class A{
int m_value;
A& operator=(const A& Other){
m_value = Other.value;
this->value = Other.value;
return *this;
}
};
```
As `this->value` and `Other.value` are Dependent they are ignored when creating the fix-its, however this can easily be resolved.
Proposed behaviour:
```
template<typename T>
class A{
int m_value;
A& operator=(const A& Other){
m_value = Other.m_value;
this->m_value = Other.m_value;
return *this;
}
};
```
Reviewers: aaron.ballman, JonasToth, alexfh, hokein, gribozavr2
Reviewed By: aaron.ballman
Subscribers: merge_guards_bot, xazax.hun, cfe-commits
Tags: #clang, #clang-tools-extra
Differential Revision: https://reviews.llvm.org/D73052
The challenge with measuring time in tests is that slow and/or busy
machines can cause tests to fail in unexpected ways. After this change,
three tests should be much more robust. The only remaining and tiny race
that I can think of is preemption after `--countDown`. That being said,
the race isn't fixable because the standard library doesn't provide a
way to count threads that are waiting to acquire a lock.
Reviewers: ldionne, EricWF, howard.hinnant, mclow.lists, #libc
Reviewed By: ldionne, #libc
Subscribers: dexonsmith, jfb, broadwaylamb, libcxx-commits
Tags: #libc
Differential Revision: https://reviews.llvm.org/D79406
For the sint_to_fp(and(X,C)) -> and(X,sint_to_fp(C)) fold, allow combineVectorCompareAndMaskUnaryOp to match any X that ComputeNumSignBits says is all-bits, not just SETCC.
Noticed while investigating mask promotion issues in PR45808
For the sint_to_fp(and(X,C)) -> and(X,sint_to_fp(C)) fold, combineVectorCompareAndMaskUnaryOp only matches X against SETCC (with an all-bits result) when really it could accept anything that ComputeNumSignBits says is all-bits.
Noticed while investigating mask promotion issues in PR45808
D55859 and D63339 prevented needless dependencies on system symbol
files. This testcase was checked-in afterwards and it brings back one
such unwanted dependency. Under some circumstances it may cause false
FAILs and/or excessive resource usage to run the testcase.
clang-format does not support .py so I have formatted it as I found most
compatible.
Also this is not a full testcase-style initialization, for example
--no-lldbinit ignores env("NO_LLDBINIT") setting which lldbtest.py does
implement:
# If we spawn an lldb process for test (via pexpect), do not load the
# init file unless told otherwise.
if os.environ.get("NO_LLDBINIT") != "NO":
self.lldbOption += " --no-lldbinit"
But this is what lldbpexpect.py does - it also ignores
env("NO_LLDBINIT"). Sure one could also fix lldbpexpect.py to unify the
initialization more with lldbtest.py but I find that outside of the
scope of this patch.
Differential Revision: https://reviews.llvm.org/D79649
It looks like that was an initial intention, but some code paths in
`DWARFExpression::Operation::extract()` did not initialize `EndOffset`
properly.
Differential Revision: https://reviews.llvm.org/D79622
Previously we implemented non-standard disambiguation rules to
distinguish an enum-base from a bit-field but otherwise treated a :
after an elaborated-enum-specifier as introducing an enum-base. That
misparses various examples (anywhere an elaborated-type-specifier can
appear followed by a colon, such as within a ternary operator or
_Generic).
We now implement the C++11 rules, with the old cases accepted as
extensions where that seemed reasonable. These amount to:
* an enum-base must always be accompanied by an enum definition (except
in a standalone declaration of the form 'enum E : T;')
* in a member-declaration, 'enum E :' always introduces an enum-base,
never a bit-field
* in a type-specifier (or similar context), 'enum E :' is not
permitted; the colon means whatever else it would mean in that
context.
Fixed underlying types for enums are also permitted in Objective-C and
under MS extensions, plus as a language extension in all other modes.
The behavior in ObjC and MS extensions modes is unchanged (but the
bit-field disambiguation is a bit better); remaining language modes
follow the C++11 rules.
Fixes PR45726, PR39979, PR19810, PR44941, and most of PR24297, plus C++
core issues 1514 and 1966.
Linkage type was only referenced for functions, not for global
variables.
Clarify that LLVM doesn't make assumption about the allocation size when
no definitive initializer for a global variable is known.
Differential Revision: https://reviews.llvm.org/D78952
This patch stores the alignment for ConstantPoolSDNode as an
Align and updates the getConstantPool interface to take a MaybeAlign.
Removing getAlignment() will be done as a follow up.
Differential Revision: https://reviews.llvm.org/D79436
This is mostly useful if alloca element type is not integer
and then casted to an integer for load or store. We now can
vectorize an [i32] alloca but cannot do so for [float].
There also a separate patch needed to properly lower 64 bit
types after they vectorized. At the moment these are lowered
via scratch anyway.
Differential Revision: https://reviews.llvm.org/D79641
Separate functions that require shared state into a class to avoid
needing to pass them though multiple functions just to be available
where needed.
The main motivation for this is that we would like to remove the
limitation that accumulator values be dynamic constant, which would
require additional shared state between call eliminations in the same
function, compounding this issue.
Differential Revision: https://reviews.llvm.org/D79299
I noticed that std::error_code() does one-time initialization. Avoid
that overhead with Expected<T> and llvm::Error. Also, it is consistent
with the virtual interface and ELF, and generally cleaner.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D79643
Summary:
Since the underlying wait and notify instructions are only available
when the atomics feature is enabled, it only makes sense to expose
their builtin functions when atomics are enabled.
Reviewers: aheejin, sunfish
Subscribers: dschuff, sbc100, jgravelle-google, jfb, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79534
Summary:
The WebAssembly backend automatically lowers atomic operations and TLS
to nonatomic operations and non-TLS data when either are present and
the atomics or bulk-memory features are not present, respectively. The
resulting object is no longer thread-safe, so the linker has to be
told not to allow it to be linked into a module with shared
memory. This was previously done by disallowing the 'atomics' feature,
which prevented any objct with its atomic operations or TLS removed
from being linked with any object containing atomics or TLS, and
therefore preventing it from being linked into a module with shared
memory since shared memory requires atomics.
However, as of https://github.com/WebAssembly/threads/issues/144, the
validation rules are relaxed to allow atomic operations to validate
with unshared memories, which makes it perfectly safe to link an
object with stripped atomics and TLS with another object that still
contains TLS and atomics as long as the resulting module has an
unshared memory. To allow this kind of link, this patch disallows a
pseudo-feature 'shared-mem' rather than 'atomics' to communicate to
the linker that the object is not thread-safe. This means that the
'atomics' feature is available to accurately reflect whether or not an
object has atomics enabled.
As a drive-by tweak, this change also requires that bulk-memory be
enabled in addition to atomics in order to use shared memory. This is
because initializing shared memories requires bulk-memory operations.
Reviewers: aheejin, sbc100
Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79542
The code to prevent using `PPCXCOFFMCAsmInfo` with little-endian targets
used an incorrect check. Also, there does not appear to be sufficient
earlier checking to prevent failing this check, so the check here is
upgraded to be a `report_fatal_error`.
`PPCAIXAsmPrinter` was also missing a check against use with
little-endian targets. This patch adds such a check in.
Summary:
This patch introduces the constants defined to identify DWARF sections
in XCOFF into `llvm/BinaryFormat/XCOFF.h` and adds (NFC) placeholder
code to `llvm/lib/MC/MCObjectFileInfo.cpp` where the DWARF sections for
XCOFF are to be set up.
Reviewers: jasonliu, sfertile, daltenty, DiggerLin, Xiangling_L
Reviewed By: jasonliu, sfertile, DiggerLin
Differential Revision: https://reviews.llvm.org/D79220
Summary:
`AsmPrinter::emitGlobalIndirectSymbol` is dependent on
`MCStreamer::emitAssignment` to produce `.set` directives for alias
symbols; however, the `.set` pseudo-op on AIX is documented as not
usable with external relocatable terms or expressions, which limits its
applicability in generating alias symbols.
Disable generating aliases on AIX until a different implementation
strategy is available.
Reviewers: cebowleratibm, jasonliu, sfertile, daltenty, DiggerLin
Reviewed By: jasonliu
Differential Revision: https://reviews.llvm.org/D79044
This fixes a verifier failure on a bot:
http://green.lab.llvm.org/green/job/test-suite-verify-machineinstrs-aarch64-O0-g/
```
*** Bad machine code: MBB has duplicate entries in its successor list. ***
- function: foo
- basic block: %bb.5 indirectgoto (0x7fe3d687ca08)
```
One of the GCC torture suite tests (pr70460.c) has an indirectbr instruction
which has duplicate blocks in its destination list.
According to the langref this is allowed:
> Blocks are allowed to occur multiple times in the destination list, though
> this isn’t particularly useful.
(https://www.llvm.org/docs/LangRef.html#indirectbr-instruction)
We don't allow this in MIR. So, when we translate such an instruction, the
verifier screams.
This patch makes `translateIndirectBr` check if a successor has already been
added to a block. If the successor is present, it is skipped rather than added
twice.
Differential Revision: https://reviews.llvm.org/D79609