Use the zx_clock_get_monotonic system call directly rather than
going through the POSIX clock_gettime function. The libc function
is a trivial wrapper around the system call, and is not a standard C
function. Avoiding it reduces the Fuchsia libc ABI surface that
libc++ depends on.
Reviewed By: phosek, ldionne, #libc
Differential Revision: https://reviews.llvm.org/D116606
This patch simplifies the interface between RAGreedy and the eviction
adviser by passing the allocator to the adviser, which allows the latter
to extract needed information as needed, rather than requiring it be passed
piecemeal at construction time (which would also complicate later
evolution).
Part of this, the patch also moves ExtraRegInfo back to RAGreedy. We
keep the encapsulation of ExtraRegInfo because it has benefits (e.g.
improved readability by abstracting access to the cascade info) and also
simpler re-initialization at regalloc pass re-entry time (we just flush
the Optional).
Differential Revision: https://reviews.llvm.org/D116669
I recently had an email exchange on flang-dev that revealed that the
documentation on how to build flang is incorrect. This update fixes
that.
Differential Revision: https://reviews.llvm.org/D116566
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.
Differential Revision: https://reviews.llvm.org/D116596
We were trying to guess at the original IR type for image intrinsics
after legalization to figure out if they were d16, but this didn't
work. Explicitly track if this is a d16 operation or not in the
opcode, as is done for the buffer intrinsics.
The OpenCL library is using f32 image writes with a dmask of 15 for
some reason, and this was incorrectly switching them to use d16. Fixes
image failures in the OpenCL conformance test. The equivalent dmask
for loads doesn't even select in either selector.
This diff enables users to override CMAKE_C_ARCHIVE_CREATE & CMAKE_CXX_ARCHIVE_CREATE
(currently set in HandleLLVMOptions.cmake).
For example, one can specify
cmake -DCMAKE_C_ARCHIVE_CREATE="<CMAKE_AR> TDqc <TARGET> <LINK_FLAGS> <OBJECTS>" \
-DCMAKE_CXX_ARCHIVE_CREATE="<CMAKE_AR> TDqc <TARGET> <LINK_FLAGS> <OBJECTS>" ...
to make the build create thin archives instead of regular ones.
For a clean run `ninja lld` using thin archives seems to reduce the size
of the build directory from ~14GB to ~8GB
Differential revision: https://reviews.llvm.org/D116850
If we know the source is a valid object, we do not need to insert a
null check. This misses a lot of opportunities from
metadata/attributes not tracked in codegen.
Currently, something like `print *, size(foo(n,m))` was rewritten
to `print *, size(foo_result_symbol)` when foo result is a non constant
shape array. This cannot be processed by lowering or reprocessed by a
Fortran compiler since the syntax is wrong (`foo_result_symbol` is
unknown on the caller side) and the arguments are lost when they might
be required to compute the result shape.
It is not possible (and probably not desired) to make GetShape fail in
general in such case since returning nullopt seems only expected for
scalars or assumed rank (see GetRank usage in lib/Semantics/check-call.cpp),
and returning a vector with nullopt extent may trigger some checks to
believe they are facing an assumed size (like here in intrinsic argument
checks: 196204c72c/flang/lib/Evaluate/intrinsics.cpp (L1530)).
Hence, I went for a solution that limits the rewrite change to folding
(where the original expression is returned if the shape depends on a non
constant shape from a call).
I added a non default option to GetShapeHelper that prevents the rewrite
of shape inquiry on calls to descriptor inquiries. At first I wanted to
avoid touching GetShapeHelper, but it would require to re-implement all
its logic to determine if the shape comes from a function call or not
(the expression could be `size(1+foo(n,m))`). So added an alternate
entry point to GetShapeHelper seemed the cleanest solution to me.
Differential Revision: https://reviews.llvm.org/D116933
We only support both TLSDESC and TLS GD for x86 so this is an x86-specific
problem. If both are used, only one R_X86_64_TLSDESC is produced and TLS GD
accesses will incorrectly reference R_X86_64_TLSDESC. Fix this by introducing
SymbolAux::tlsDescIdx.
Reviewed By: ikudrin
Differential Revision: https://reviews.llvm.org/D116900
As discussed in https://github.com/llvm/llvm-project/issues/53020 / https://reviews.llvm.org/D116692,
SCEV is forbidden from reasoning about 'backedge taken count'
if the branch condition is a poison-safe logical operation,
which is conservatively correct, but is severely limiting.
Instead, we should have a way to express those
poison blocking properties in SCEV expressions.
The proposed semantics is:
```
Sequential/in-order min/max SCEV expressions are non-commutative variants
of commutative min/max SCEV expressions. If none of their operands
are poison, then they are functionally equivalent, otherwise,
if the operand that represents the saturation point* of given expression,
comes before the first poison operand, then the whole expression is not poison,
but is said saturation point.
```
* saturation point - the maximal/minimal possible integer value for the given type
The lowering is straight-forward:
```
compare each operand to the saturation point,
perform sequential in-order logical-or (poison-safe!) ordered reduction
over those checks, and if reduction returned true then return
saturation point else return the naive min/max reduction over the operands
```
https://alive2.llvm.org/ce/z/Q7jxvH (2 ops)
https://alive2.llvm.org/ce/z/QCRrhk (3 ops)
Note that we don't need to check the last operand: https://alive2.llvm.org/ce/z/abvHQS
Note that this is not commutative: https://alive2.llvm.org/ce/z/FK9e97
That allows us to handle the patterns in question.
Reviewed By: nikic, reames
Differential Revision: https://reviews.llvm.org/D116766
Before this patch, the user needed to specialize both of
`is_placeholder<MyType>` and `is_placeholder<const MyType>`.
After this patch, only the former is needed (although the
latter is harmless if provided).
The new tests don't actually fail unless return type deduction
is used, which is a C++14 feature. Specializing `is_placeholder`
is still allowed in C++11, though.
Fixes#51095.
Differential Revision: https://reviews.llvm.org/D116388
Clang will now search through the framework includes to identify
the framework include path to a file, and then suggest a framework
style include spelling for the file.
Differential Revision: https://reviews.llvm.org/D115183
This reverts commit 80e2c58749.
The original patch causes a lot of warnings on gcc like:
llvm-project/clang/include/clang/Basic/Diagnostic.h:1329:3: warning:
base class ‘class clang::StreamingDiagnostic’ should be explicitly
initialized in the copy constructor [-Wextra]
(Split from original patch to separate non-NFC part and add coverage. I typoed when adding the new test, so this change includes the typo fix to let libfunc recongize the signature. Didn't figure it was worth another separate commit.)
Differential Revision: https://reviews.llvm.org/D116851 (part 2 of 2)
There are a few places where the alignment argument for AlignedAllocLike functions was previously hardcoded. This patch adds an getAllocAlignment function and a change to the MemoryBuiltin table to allow alignment arguments to be found generically.
This will shortly allow alignment inference on operator new's with align_val params and an extension to Attributor's HeapToStack. The former will follow shortly - I split Bryce's patch for purpose of having the large change be NFC. The later will be reviewed separately.
Differential Revision: https://reviews.llvm.org/D116851 (part 1 of 2)
These tests are interested in the FP instructions being used, not
the conversions needed to pass the arguments/returns in GPRs.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D116869
Given a select whose result is an i1, we can eliminate the conditional in the select completely by adding a few arithmetic operations.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D116839
This change removes a previous restriction where we had to prove the allocation performed by aligned_alloc was non-zero in size before using the align parameter to annotate the result. I believe this was conservatism around the C11 specification of this routine which allowed UB when size was not a multiple of alignment, but if so, it was a partial one at best. (ex: align 32, size 16 was equally UB, but not restricted) The spec has since been clarified to require nullptr return, not UB.
A nullptr - the documented return for this function on failure for all cases after UB mentioned above was removed - is trivially aligned for any power of two. This isn't totally new behavior even for this transform, we'd previously annotate potentially failing allocs (e.g. huge sizes) meaning we were putting align on potentially null pointers anyways. This change simpy does the same for all failure modes.
DexUnreachable is a useful tool for specifying that lines shouldn't be
stepped on. Right now they have to be placed in the source file; lets allow
them to be placed instead in a detached .dex file, by adding on_line and
line-range keyword arguments to the command.
Differential Revision: https://reviews.llvm.org/D115449
This file has completely wrong formatting,
and modifying it leads to having to fight around that. every time.
This is a pure reformatting, there are *NO* other changes here.
If we look at potentially interfering accesses we need to ensure the
"IsExact" flag is set appropriately. Accesses that have an "unknown"
size or offset cannot be exact matches and we missed to flag that.
Error and test reported by Serguei N. Dmitriev.
All paths (that actually do anything) require a successful dyn_cast<CallBase> - so just earlyout if the cast fails
Fixes static analyzer nullptr deference warning
Previously we would persist the flags indicating whether the remote side
supports a particular feature across reconnects, which is obviously not
a good idea.
I implement the clearing by nuking (its the only way to be sure :) the
entire GDBRemoteCommunication object in the disconnect operation and
creating a new one upon connection. This allows us to maintain a nice
invariant that the GDBRemoteCommunication object (which is now a
pointer) exists only if it is connected. The downside to that is that a
lot of functions now needs to check the validity of the pointer instead
of blindly accessing the object.
The process communication does not suffer from the same issue because we
always destroy the entire Process object for a relaunch.
Differential Revision: https://reviews.llvm.org/D116539
This revision fixes SubviewOp, InsertSliceOp, ExtractSliceOp construction during bufferization
where not all offset/size/stride operands were properly specified.
A test that exhibited problematic behaviors related to incorrect memref casts is introduced.
Init tensor optimization is disabled in teh testing func bufferize pass.
Differential Revision: https://reviews.llvm.org/D116899
Currently, the transfer function returns a new lattice element, which forces an
unnecessary copy on processing each CFG statement.
Differential Revision: https://reviews.llvm.org/D116834
init_tensor elimination is arguably a pre-optimization that should be separated from comprehensive bufferization.
In any case it is still experimental and easily results in wrong IR with violated SSA def-use orderings.
Isolate the optimization behind a flag, separate the test cases and add a test case that would results in wrong IR.
Differential Revision: https://reviews.llvm.org/D116936
The code in VPWidenCanonicalIVRecipe::execute only worked for fixed-width
vectors due to the way we generate the values per lane. This patch changes
the code to use a combination of vector splats and step vectors to get
the same result. This then works for both fixed-width and scalable vectors.
Tests that exercise this code path for scalable vectors have been added here:
Transforms/LoopVectorize/AArch64/sve-tail-folding.ll
Differential Revision: https://reviews.llvm.org/D113180
SROA has 3 data-structures where it stores sets of instructions that should
be deleted:
- DeadUsers -> instructions that are UB or have no users
- DeadOperands -> instructions that are UB or operands of useless phis
- DeadInsts -> "dead" instructions, including loads of uninitialized memory
with users
The first 2 sets can be RAUW with poison instead of undef. No brainer as UB
can be replaced with poison, and for instructions with no users RAUW is a
NOP.
The 3rd case cannot be currently replaced with poison because the set mixes
the loads of uninit memory. I leave that alone for now.
Another case where we can use poison is in the construction of vectors from
multiple loads. The base vector for the first insertelement is now poison as
it doesn't matter as it is fully overwritten by inserts.
Differential Revision: https://reviews.llvm.org/D116887