At most one variant member of a union may have a default member
initializer. The case of anonymous records with multiple levels of
nesting like the following also needs to meet this rule. The original
logic is to horizontally obtain all the member variables in a record
that need to be initialized and then filter to the variables that need
to be fixed. Obviously, it is impossible to correctly initialize the
desired variables according to the nesting relationship.
See Example 3 in class.union
union U {
U() {}
int x; // int x{};
union {
int k; // int k{}; <== wrong fix
};
union {
int z; // int z{}; <== wrong fix
int y;
};
};
cxx_dr_status.html
I noticed that these two DRs are currently working correctly, so I
added a pair of lit tests that check the AST (which is most useful for
CWG1779, since 'dependent' is really only observable in an ast dump) to
make sure __func__ works correctly in dependent cases, and in lambda
operator().
Also noticed that CWG1762, mostly an 'example' change, works correctly,
so added a test so that it gets marked 'done' as well.
Additionally, I regenerated cxx_dr_status.html, updating it for Clang
13's release, based on the cwg_status.html from August 12, 2021.
Differential Revision: https://reviews.llvm.org/D109956
This patch marks splat immediate instructions XXSPLTIW and XXSPLTIDP as
rematerializable to prevent MachineLICM from moving them out of loops.
Reviewed By: lei, amy
Differential revision: https://reviews.llvm.org/D108823
It seems the crashes we saw wasn't caused by this (see comments on the review).
> This is basically D108837 but for jump threading. Free instructions
> should be ignored for the threading decision. JumpThreading already
> skips some free instructions (like pointer bitcasts), but does not
> skip various free intrinsics -- in fact, it currently gives them a
> fairly large cost of 2.
>
> Differential Revision: https://reviews.llvm.org/D110290
This reverts commit 4604695d7c.
With architected flat scratch it becomes readonly. We must always
reserve SGPR pair for it even if we do not use scratch at all since
an attempt to write to SGPRs mapped to FLAT_SCRATCH results in
memory violation.
This is not needed since GFX10 with architected flat scratch though
since special SGPRs are not carving space from normal SGPRs.
Differential Revision: https://reviews.llvm.org/D110376
Initially, the padding transformation and the related operation were only used
to guarantee static shapes of subtensors in tiled operations. The
transformation would not insert the padding operation if the shapes were
already static, and the overall code generation would actively remove such
"noop" pads. However, this transformation can be also used to pack data into
smaller tensors and marshall them into faster memory, regardless of the size
mismatches. In context of expert-driven transformation, we should assume that,
if padding is requested, a potentially padded tensor must be always created.
Update the transformation accordingly. To do this, introduce an optional
`packing` attribute to the `pad_tensor` op that serves as an indication that
the padding is an intentional choice (as opposed to side effect of type
normalization) and should be left alone by cleanups.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D110425
This patch adds range checking for some Power10 altivec builtins. Range
checking is done in SemaChecking.
Reviewed By: #powerpc, lei, Conanap
Differential Revision: https://reviews.llvm.org/D109780
This reverts the revert commit df56fc6ebb.
This version of the patch adjusts the location where the EarliestEscapes
cache is cleared when an instruction gets removed. The earliest escaping
instruction does not have to be a memory instruction.
It could be a ptrtoint instruction like in the added test
@earliest_escape_ptrtoint, which subsequently gets removed. We need to
invalidate the EarliestEscape entry referring to the ptrtoint when
deleting it.
This fixes the crash mentioned in
https://bugs.chromium.org/p/chromium/issues/detail?id=1252762#c6
As suggested on D108522, if we're sign extending a v4i16 source before multiplying as a v4i32, then we can replace that with a zero extension and rely on the implicit sign-extension of PMADDWD.
This enforces libcxx and its benchmarks are compiled by a C++20 capable
compiler. Based on review comments in D103413.
Differential Revision: https://reviews.llvm.org/D110338
This is motivated by the examples and discussion in:
https://llvm.org/PR51245
...and related bugs.
By using vector compares and vector logic, we can convert 2 'set'
instructions into 1 'movd' or 'movmsk' and generally improve
throughput/reduce instructions.
Unfortunately, we don't have a complete vector compare ISA before
AVX, so I left SSE-only out of this patch. Ie, we'd need extra logic
ops to simulate the missing predicates for SSE 'cmpp*', so it's not
as clearly a win.
Differential Revision: https://reviews.llvm.org/D110342
This reverts commit bb9333c350.
This exposes another existing bug that causes an infinite loop as shown in
D110170
...so reverting while I look at another fix.
The stress test does various assorted things
(memory accesses, function calls, atomic operations,
thread creation/join, intercepted libc calls)
in multiple threads just to stress various parts
of the runtime.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D110416
It caused compiler crashes, see comment on the code review for repro.
> This is basically D108837 but for jump threading. Free instructions
> should be ignored for the threading decision. JumpThreading already
> skips some free instructions (like pointer bitcasts), but does not
> skip various free intrinsics -- in fact, it currently gives them a
> fairly large cost of 2.
>
> Differential Revision: https://reviews.llvm.org/D110290
This reverts commit 1e3c6fc7cb.
Add a test for __tsan_flush_memory() and for background
flushing of the runtime memory.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D110409
In build_symbolizer.sh we can safely remove the -eu argument from the shebang (which is an unportable construct), as the scripts sets **-e** and **-u** already.
Differential Revision: https://reviews.llvm.org/D110039
Refactor Socket::DecodeHostAndPort() to use LLVM API over redundant
LLDB API. In particular, this means llvm::Regex, llvm::Error return
type and llvm::to_integer().
While at it, change the port type from int32_t to uint16_t. The method
never returns any value outside this range, and using the correct type
allows us to rely on getAsInteger()'s implicit overflow check.
Differential Revision: https://reviews.llvm.org/D110391
There are several places in the code that are currently broken where
we assume an Instruction is always a member of a BasicBlock that
lives in a Function. This is a problem specifically when
attempting to get the vscale_range attribute. This patch adds checks
that an Instruction's parent also has a parent!
I've added a test for a function-less @llvm.vscale intrinsic call here:
unittests/Analysis/ValueTrackingTest.cpp
Add support to create unique name for namelist group and be able to
deconstruct them.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D110331
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Refactor Socket::DecodeHostAndPort() to use LLVM API over redundant
LLDB API. In particular, this means llvm::Regex, llvm::Error return
type and llvm::to_integer().
While at it, change the port type from int32_t to uint16_t. The method
never returns any value outside this range, and using the correct type
allows us to rely on getAsInteger()'s implicit overflow check.
Differential Revision: https://reviews.llvm.org/D110391
In repairIntervalsInRange, if the new instructions refer to subregs but
the old instructions did not, make sure any existing live interval for
the superreg is updated to have subranges. Also skip repairing any range
that we have recalculated from scratch, partly for efficiency but also
to avoids some cases that repairOldRegInRange can't handle.
The existing test/CodeGen/AMDGPU/twoaddr-regsequence.mir provides some
test coverage for this change: when TwoAddressInstructionPass converts
REG_SEQUENCE into subreg copies, the live intervals will now get
subranges and MachineVerifier will verify that the subranges are
correct. Unfortunately MachineVerifier does not complain if the
subranges are not present, so the test also passed before this patch.
This patch also fixes ~800 of the ~1500 failures in the whole CodeGen
lit test suite when -early-live-intervals is forced on.
Differential Revision: https://reviews.llvm.org/D110328
The fix applied in D23303 "LiveIntervalAnalysis: fix a crash in repairOldRegInRange"
was over-zealous. It would bail out when the end of the range to be
repaired was in the middle of the first segment of the live range of
Reg, which was always the case when the range contained a single def of
Reg.
This patch fixes it as suggested by Matthias Braun in post-commit review
on the original patch, and tests it by adding -early-live-intervals to
a selection of existing lit tests that now pass.
(Note that D23303 was originally applied to fix a crash in
SILoadStoreOptimizer, but that is now moot since D23814 updated
SILoadStoreOptimizer to run before scheduling so it no longer has to
update live intervals.)
Differential Revision: https://reviews.llvm.org/D110238
Unrevert with some changes to the tests:
- Add -verify-machineinstrs to check for remaining problems in live
interval support in TwoAddressInstructionPass.
- Drop test/CodeGen/AMDGPU/extract-load-i1.ll since it suffers from
some of those remaining problems.
Add support for intersecting, subtracting, complementing and checking equality of sets having divisions.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D110138
This canonicalization pattern complements the tensor.cast(pad_tensor) one in
propagating constant type information when possible. It contributes to the
feasibility of pad hoisting.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D110343
When moving an entire basic block BB before InsertPoint, currently
we check for all instructions whether the operands dominates
InsertPoint, however, this can be improved such that even an
operand does not dominate InsertPoint, as long as it appears as
a previous instruction in the same BB, it is safe to move.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D110378
Fixes issue found on greendragon buildbot, in which an incorrectly
indented statement following an if block led to entire frames being
dropped instead of simply filtering unneeded watches.
This reverts commit 1f44fa3ac1.
This reverts commit 0c2a454845.
This was my mistake. When gdb can find its data directory it'll
import it automatically. If it can't (like when you're using a
version from a build folder) you need to give it the data
directory path.
We're safe to assume gdb is installed for testing purposes
so it'll import it for us.
Add the tail policy argument to Clang builtins. There
are two policies for tail elements. Tail agnostic means users do not
care about the values in the tail elements and tail undisturbed means
the values in the tail elements need to be kept after the operation. In
order to let users control the tail policy, we add an additional
argument at the end of the argument list.
For unmasked operations, we have no maskedoff and the tail policy is
always tail agnostic. If users want to keep tail elements under unmasked
operations, they could use all one mask in the masked operations to do
it. So, we only add the additional argument for masked operations for
most cases. There are exceptions listed below.
In this patch, we do not handle the following cases to reduce the
complexity of the patch. There could be two separate patches for them.
Use dest argument to control tail policy
vmerge.vvm/vmerge.vxm/vmerge.vim (add _t builtins with additional dest
argument)
vfmerge.vfm (add _t builtins with additional dest argument)
vmv.v.v (add _t builtins with additional dest argument)
vmv.v.x (add _t builtins with additional dest argument)
vmv.v.i (add _t builtins with additional dest argument)
vfmv.v.f (add _t builtins with additional dest argument)
vadc.vvm/vadc.vxm/vadc.vim (add _t builtins with additional dest
argument)
vsbc.vvm/vsbc.vxm (add _t builtins with additional dest argument)
Always has tail argument for masked/unmasked intrinsics
Vector Single-Width Integer Multiply-Add Instructions (add _t and _mt
builtins)
Vector Widening Integer Multiply-Add Instructions (add _t and _mt
builtins)
Vector Single-Width Floating-Point Fused Multiply-Add Instructions (add
_t and _mt builtins)
Vector Widening Floating-Point Fused Multiply-Add Instructions (add _t
and _mt builtins)
Vector Reduction Operations (add _t and _mt builtins)
Vector Slideup Instructions (add _t and _mt builtins)
Vector Slidedown Instructions (add _t and _mt builtins)
Discussion: https://github.com/riscv/rvv-intrinsic-doc/pull/101
Differential Revision: https://reviews.llvm.org/D109322
Add the tail policy argument to LLVM IR intrinsics. There are two policies for tail elements. Tail agnostic means users do not care about the values in the tail elements and tail undisturbed means the values in the tail elements need to be kept after the operation. In order to let users control the tail policy, we add an additional argument at the end of the argument list.
For unmasked operations, we have no maskedoff and the tail policy is always tail agnostic. If users want to keep tail elements under unmasked operations, they could use all one mask in the masked operations to do it. So, we only add the additional argument for masked operations for most cases. There are exceptions listed below.
In this patch, we do not handle the following cases to reduce the complexity of the patch. There could be two separate patches for them.
* Use dest argument to control tail policy
vmerge.vvm/vmerge.vxm/vmerge.vim (add _t builtins with additional dest argument)
vfmerge.vfm (add _t builtins with additional dest argument)
vmv.v.v (add _t builtins with additional dest argument)
vmv.v.x (add _t builtins with additional dest argument)
vmv.v.i (add _t builtins with additional dest argument)
vfmv.v.f (add _t builtins with additional dest argument)
vadc.vvm/vadc.vxm/vadc.vim (add _t builtins with additional dest argument)
vsbc.vvm/vsbc.vxm (add _t builtins with additional dest argument)
* Always has tail argument for masked/unmasked intrinsics
Vector Single-Width Integer Multiply-Add Instructions (add _t and _mt builtins)
Vector Widening Integer Multiply-Add Instructions (add _t and _mt builtins)
Vector Single-Width Floating-Point Fused Multiply-Add Instructions (add _t and _mt builtins)
Vector Widening Floating-Point Fused Multiply-Add Instructions (add _t and _mt builtins)
Vector Reduction Operations (add _t and _mt builtins)
Vector Slideup Instructions (add _t and _mt builtins)
Vector Slidedown Instructions (add _t and _mt builtins)
Discussion: https://github.com/riscv/rvv-intrinsic-doc/pull/101
Differential Revision: https://reviews.llvm.org/D105092
Testing on a SLM box suggests these can run on either port, but the throughput is 4cy on either (inc MMX versions). Confirmed with Intel AoM / Agner / InstLatX64.
There are several places in the code that are currently broken as
they assume an Instruction always has a parent Function when
attempting to get the vscale_range attribute. This patch adds checks
that an Instruction has a parent.
I've added a test for a parentless @llvm.vscale intrinsic call here:
unittests/Analysis/ValueTrackingTest.cpp
Differential Revision: https://reviews.llvm.org/D110158
This change is to keep the help text and command guide of objcopy in
tandem.
- In the help output the options --rename-section and
--set-section-flags were missing the flag exclude, which is found in
the command guide.
- In the command guide the alias -G for --keep-global-symbol was
missing, which is found in the help output.
Differential Revision: https://reviews.llvm.org/D110340