Summary:
Adds the constraints described below to ensure that we
can tie variables of SVE ACLE types to operands in inline-asm:
- y: SVE registers Z0-Z7
- Upl: One of the low eight SVE predicate registers (P0-P7)
- Upa: Full range of SVE predicate registers (P0-P15)
Reviewers: sdesmalen, huntergr, rovka, cameron.mcinally, efriedma, rengolin
Reviewed By: efriedma
Subscribers: miyuki, tschuett, rkruppe, psnobl, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75690
Summary:
Right now we annotate C++'s `operator new` with `noalias` attribute,
which very much is healthy for optimizations.
However as per [[ http://eel.is/c++draft/basic.stc.dynamic.allocation | `[basic.stc.dynamic.allocation]` ]],
there are more promises on global `operator new`, namely:
* non-`std::nothrow_t` `operator new` *never* returns `nullptr`
* If `std::align_val_t align` parameter is taken, the pointer will also be `align`-aligned
* ~~global `operator new`-returned pointer is `__STDCPP_DEFAULT_NEW_ALIGNMENT__`-aligned ~~ It's more caveated than that.
Supplying this information may not cause immediate landslide effects
on any specific benchmarks, but it for sure will be healthy for optimizer
in the sense that the IR will better reflect the guarantees provided in the source code.
The caveat is `-fno-assume-sane-operator-new`, which currently prevents emitting `noalias`
attribute, and is automatically passed by Sanitizers ([[ https://bugs.llvm.org/show_bug.cgi?id=16386 | PR16386 ]]) - should it also cover these attributes?
The problem is that the flag is back-end-specific, as seen in `test/Modules/explicit-build-flags.cpp`.
But while it is okay to add `noalias` metadata in backend, we really should be adding at least
the alignment metadata to the AST, since that allows us to perform sema checks on it.
Reviewers: erichkeane, rjmccall, jdoerfert, eugenis, rsmith
Reviewed By: rsmith
Subscribers: xbolva00, jrtc27, atanasyan, nlopes, cfe-commits
Tags: #llvm, #clang
Differential Revision: https://reviews.llvm.org/D73380
Summary:
As @rsmith notes in https://reviews.llvm.org/D73020#inline-672219
while that is certainly UB land, it may not be actually reachable at runtime, e.g.:
```
template<int N> void *make() {
if ((N & (N-1)) == 0)
return operator new(N, std::align_val_t(N));
else
return operator new(N);
}
void *p = make<7>();
```
and we shouldn't really error-out there.
That being said, i'm not really following the logic here.
Which ones of these cases should remain being an error?
Reviewers: rsmith, erichkeane
Reviewed By: erichkeane
Subscribers: cfe-commits, rsmith
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73996
This brings back 2af74e27ed and reverts
eaabaf7e04.
The changes were correct, the code that was broken contained an ODR
violation that assumed that these types are passed equivalently:
struct alignas(uint64_t) Wrapper { uint64_t P };
void f(uint64_t p);
void f(Wrapper p);
MSVC does not pass them the same way, and so clang-cl should not pass
them the same way either.
Null-check and adjut a TypeLoc before casting it to a FunctionTypeLoc.
This fixes a crash in -fsanitize=nullability-return, and also makes the
location of the nonnull type available when the return type is adjusted.
rdar://59263039
Differential Revision: https://reviews.llvm.org/D74355
These temporaries are only used in the callee, and their memory can be reused
after the call is complete.
rdar://58552124
Differential revision: https://reviews.llvm.org/D74094
If the return block is unreachable, clang removes it in
CodeGenFunction::FinishFunction(). This removal can leave dangling
references to values defined in the return block if the return block has
successors, which it /would/ if UBSan's return value check is emitted.
In this case, as the UBSan check wouldn't be reachable, it's better to
simply not emit it.
rdar://59196131
AMDGPU and x86 at least both have separate controls for whether
denormal results are flushed on output, and for whether denormals are
implicitly treated as 0 as an input. The current DAGCombiner use only
really cares about the input treatment of denormals.
When T is a class type, only nvsize(T) bytes need be accessible through
the reference. We had matching bugs in the application of the
dereferenceable attribute and in -fsanitize=undefined.
When using -fno-builtin[-<name>], we don't attach the IR attributes to
function definitions with no Decl, like the ones created through
`CreateGlobalInitOrDestructFunction`.
This results in projects using -fno-builtin or -ffreestanding to start
seeing symbols like _memset_pattern16.
The fix changes the behavior to always add the attribute if LangOptions
requests it.
Differential Revision: https://reviews.llvm.org/D73495
MSVC 2013 would refuse to pass highly aligned things (typically vectors
and aggregates) by value. Users would receive this error:
t.cpp(11) : error C2719: 'w': formal parameter with __declspec(align('32')) won't be aligned
t.cpp(11) : error C2719: 'q': formal parameter with __declspec(align('32')) won't be aligned
However, in MSVC 2015, this behavior was changed, and highly aligned
things are now passed indirectly. To avoid breaking backwards
incompatibility, objects that do not have a *required* high alignment
(i.e. double) are still passed directly, even though they are not
naturally aligned. This change implements the new behavior of passing
things indirectly.
The new behavior is:
- up to three vector parameters can be passed in [XYZ]MM0-2
- remaining arguments with required alignment greater than 4 bytes are
passed indirectly
Previously, MSVC never passed things truly indirectly, meaning clang
would always apply the byval attribute to indirect arguments. We had to
go to the trouble of adding inalloca so that non-trivially copyable C++
types could be passed in place without copying the object
representation. When inalloca was added, we asserted that all arguments
passed indirectly must use byval. With this change, that assert no
longer holds, and I had to update inalloca to handle that case. The
implicit sret pointer parameter was already handled this way, and this
change generalizes some of that logic to arguments.
There are two cases that this change leaves unfixed:
1. objects that are non-trivially copyable *and* overaligned
2. vectorcall + inalloca + vectors
For case 1, I need to touch C++ ABI code in MicrosoftCXXABI.cpp, so I
want to do it in a follow-up.
For case 2, my fix is one line, but it will require updating IR tests to
use lots of inreg, so I wanted to separate it out.
Related to D71915 and D72110
Fixes most of PR44395
Reviewed By: rjmccall, craig.topper, erichkeane
Differential Revision: https://reviews.llvm.org/D72114
Summary:
Much like with the previous patch (D73005) with `AssumeAlignedAttr`
handling, results in mildly more readable IR,
and will improve test coverage in upcoming patch.
Note that in `AllocAlignAttr`'s case, there is no requirement
for that alignment parameter to end up being an I-C-E.
Reviewers: erichkeane, jdoerfert, hfinkel, aaron.ballman, rsmith
Reviewed By: erichkeane
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73006
Summary:
This should be mostly NFC - we still lower the same alignment
knowledge to the IR. The main reasoning here is that
this somewhat improves readability of IR like this,
and will improve test coverage in upcoming patch.
Even though the alignment is guaranteed to always be an I-C-E,
we don't always materialize it as llvm's Alignment Attribute because:
1. There may be a non-zero offset
2. We may be sanitizing for alignment
Note that if there already was an IR alignment attribute
on return value, we union them, and thus the alignment
only ever rises.
Also, there is a second relevant clang attribute `AllocAlignAttr`,
so that is why `AbstractAssumeAlignedAttrEmitter` is templated.
Reviewers: erichkeane, jdoerfert, hfinkel, aaron.ballman, rsmith
Reviewed By: erichkeane
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73005
Summary: Just an NFC code cleanup i stumbled upon when stumbling through clang alignment attribute handling.
Reviewers: erichkeane, gchatelet, courbet, jdoerfert
Reviewed By: gchatelet
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72993
Summary:
We shouldn't be just giving up if we find one of them
(like we currently do with `AssumeAlignedAttr`),
we should emit them all.
As the tests show, even if we materialized good knowledge
from `__attribute__((assume_aligned(32)`, it doesn't mean
`__attribute__((alloc_align([...])))` info won't be useful.
It might be, but that isn't given.
Reviewers: erichkeane, jdoerfert, aaron.ballman
Reviewed By: erichkeane
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72979
Currently there are 4 different mechanisms for controlling denormal
flushing behavior, and about as many equivalent frontend controls.
- AMDGPU uses the fp32-denormals and fp64-f16-denormals subtarget features
- NVPTX uses the nvptx-f32ftz attribute
- ARM directly uses the denormal-fp-math attribute
- Other targets indirectly use denormal-fp-math in one DAGCombine
- cl-denorms-are-zero has a corresponding denorms-are-zero attribute
AMDGPU wants a distinct control for f32 flushing from f16/f64, and as
far as I can tell the same is true for NVPTX (based on the attribute
name).
Work on consolidating these into the denormal-fp-math attribute, and a
new type specific denormal-fp-math-f32 variant. Only ARM seems to
support the two different flush modes, so this is overkill for the
other use cases. Ideally we would error on the unsupported
positive-zero mode on other targets from somewhere.
Move the logic for selecting the flush mode into the compiler driver,
instead of handling it in cc1. denormal-fp-math/denormal-fp-math-f32
are now both cc1 flags, but denormal-fp-math-f32 is not yet exposed as
a user flag.
-cl-denorms-are-zero, -fcuda-flush-denormals-to-zero and
-fno-cuda-flush-denormals-to-zero will be mapped to
-fp-denormal-math-f32=ieee or preserve-sign rather than the old
attributes.
Stop emitting the denorms-are-zero attribute for the OpenCL flag. It
has no in-tree users. The meaning would also be target dependent, such
as the AMDGPU choice to treat this as only meaning allow flushing of
f32 and not f16 or f64. The naming is also potentially confusing,
since DAZ in other contexts refers to instructions implicitly treating
input denormals as zero, not necessarily flushing output denormals to
zero.
This also does not attempt to change the behavior for the current
attribute. The LangRef now states that the default is ieee behavior,
but this is inaccurate for the current implementation. The clang
handling is slightly hacky to avoid touching the existing
denormal-fp-math uses. Fixing this will be left for a future patch.
AMDGPU is still using the subtarget feature to control the denormal
mode, but the new attribute are now emitted. A future change will
switch this and remove the subtarget features.
Summary:
Avoid using the `nocf_check` attribute with Control Flow Guard. Instead, use a
new `"guard_nocf"` function attribute to indicate that checks should not be
added on indirect calls within that function. Add support for
`__declspec(guard(nocf))` following the same syntax as MSVC.
Reviewers: rnk, dmajor, pcc, hans, aaron.ballman
Reviewed By: aaron.ballman
Subscribers: aaron.ballman, tomrittervg, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D72167
Summary:
This is a follow up on https://reviews.llvm.org/D61634#1742154 to turn the clang driver -fno-builtin flag into an IR attribute.
I also investigated pushing the attribute earlier on (in Sema) but it looks like this patch is simple and will cover all function calls.
Reviewers: aaron.ballman, courbet
Subscribers: cfe-commits, tejohnson
Tags: #clang
Differential Revision: https://reviews.llvm.org/D71193
Patch was reverted because https://bugs.llvm.org/show_bug.cgi?id=44048
The original patch is modified to set the strictfp IR attribute
explicitly in CodeGen instead of as a side effect of IRBuilder.
In the 2nd attempt to reapply there was a windows lit test fail, the
tests were fixed to use wildcard matching.
Differential Revision: https://reviews.llvm.org/D62731
ExpandTypeFromArgs
This fixes a bug in IRGen where a call to `llvm.objc.storeStrong` was
being emitted to initialize a __strong field of an uninitialized
temporary struct, which caused crashes at runtime.
rdar://problem/51807365
AggValueSlot
This reapplies 8a5b7c3570 after a null
dereference bug in CGOpenMPRuntime::emitUserDefinedMapper.
Original commit message:
This is needed for the pointer authentication work we plan to do in the
near future.
a63a81bd99/clang/docs/PointerAuthentication.rst
Cleanup handling of the denormal-fp-math attribute. Consolidate places
checking the allowed names in one place.
This is in preparation for introducing FP type specific variants of
the denormal-fp-mode attribute. AMDGPU will switch to using this in
place of the current hacky use of subtarget features for the denormal
mode.
Introduce a new header for dealing with FP modes. The constrained
intrinsic classes define related enums that should also be moved into
this header for uses in other contexts.
The verifier could use a check to make sure the denorm-fp-mode
attribute is sane, but there currently isn't one.
Currently, DAGCombiner incorrectly asssumes non-IEEE behavior by
default in the one current user. Clang must be taught to start
emitting this attribute by default to avoid regressions when this is
switched to assume ieee behavior if the attribute isn't present.
Summary:
This is a follow up on https://reviews.llvm.org/D61634
This patch is simpler and only adds the no_builtin attribute.
Reviewers: tejohnson, courbet, theraven, t.p.northover, jdoerfert
Subscribers: mgrang, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D68028
This is a re-submit after it got reverted in https://reviews.llvm.org/rGbd8791610948 since the breakage doesn't seem to come from this patch.
Summary:
This is a follow up on https://reviews.llvm.org/D61634
This patch is simpler and only adds the no_builtin attribute.
Reviewers: tejohnson, courbet, theraven, t.p.northover, jdoerfert
Subscribers: mgrang, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D68028
The behavior from the original patch has changed, since we're no longer
allowing LLVM to just ignore the alignment. Instead, we're just
assuming the maximum possible alignment.
Differential Revision: https://reviews.llvm.org/D68824
llvm-svn: 374562
The test fails on Windows, with
error: 'warning' diagnostics expected but not seen:
File builtin-assume-aligned.c Line 62: requested alignment
must be 268435456 bytes or smaller; assumption ignored
error: 'warning' diagnostics seen but not expected:
File builtin-assume-aligned.c Line 62: requested alignment
must be 8192 bytes or smaller; assumption ignored
llvm-svn: 374456
Code to handle __builtin_assume_aligned was allowing larger values, but
would convert this to unsigned along the way. This patch removes the
EmitAssumeAligned overloads that take unsigned to do away with this
problem.
Additionally, it adds a warning that values greater than 1 <<29 are
ignored by LLVM.
Differential Revision: https://reviews.llvm.org/D68824
llvm-svn: 374450
When passing arguments using temporary allocas, we need to add the
appropriate lifetime markers so that the stack coloring passes can
re-use the stack space.
This patch keeps track of all the lifetime.start calls emited before the
codegened call, and adds the corresponding lifetime.end calls after the
call.
Differential Revision: https://reviews.llvm.org/D68611
llvm-svn: 374126
* Adds a TypeSize struct to represent the known minimum size of a type
along with a flag to indicate that the runtime size is a integer multiple
of that size
* Converts existing size query functions from Type.h and DataLayout.h to
return a TypeSize result
* Adds convenience methods (including a transparent conversion operator
to uint64_t) so that most existing code 'just works' as if the return
values were still scalars.
* Uses the new size queries along with ElementCount to ensure that all
supported instructions used with scalable vectors can be constructed
in IR.
Reviewers: hfinkel, lattner, rkruppe, greened, rovka, rengolin, sdesmalen
Reviewed By: rovka, sdesmalen
Differential Revision: https://reviews.llvm.org/D53137
llvm-svn: 374042
The static analyzer is warning about potential null dereferences, but in these cases we should be able to use castAs<RecordType> directly and if not assert will fire for us.
llvm-svn: 373584