Name resolution fails with a bogus "is not a variable" error message
when a host-associated object appears in a NAMELIST group. The root
cause is that ConvertToObjectEntity() returns false for host-associated
objects. Fix that, and also apply a similar fix to ConvertToProcEntity()
nearby.
Differential Revision: https://reviews.llvm.org/D124541
The current darwin abort_on_error test specifically tests for a division
by zero undefined behavior. However arm does not trap by default for this
behavior. x86 signals the abort, which is why the test passes on x86.
This patch updates the test to test for a case where the ubsan runtime
specifically calls Die() to trigger an abort by default.
rdar://92108564
Differential Revision: https://reviews.llvm.org/D124480
Pass the --compress-debug-sections=zlib argument to the linker when
the use of compressed debug info is requested.
Differential Revision: https://reviews.llvm.org/D114115
The default implementation of findCommutedOpIndices picks the
first two source operands. That's exactly what we want for the
scalar FMA instructions.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D124463
Introduced masks where they are not added and improved target dependent
cost models to avoid returning of the incorrect cost results after
adding masks.
Differential Revision: https://reviews.llvm.org/D100486
- Don't reset cur_line_offset to llvm::None when we don't have next_line_offset, because we may need to reuse it in new range after a code end.
- Don't use CombineConsecutiveEntriesWithEqualData for inline_site_sp->ranges, because that will combine consecutive entries with same data in the vector regardless of the entry's range. Originally, I thought that it only combine consecutive entries if adjacent entries' ranges are adjoining or intersecting with each other.
This syntax allows to modify RUN lines based on features
available. For example:
RUN: ... | FileCheck %s --check-prefix=%if windows %{CHECK-W%} %else %{CHECK-NON-W%}
CHECK-W: ...
CHECK-NON-W: ...
The whole command can be put under %if ... %else:
RUN: %if tool_available %{ %tool %} %else %{ true %}
or:
RUN: %if tool_available %{ %tool %}
If tool_available feature is missing, we'll have an empty command in
this RUN line. LIT used to emit an error for empty commands, but now
it treats such commands as nop in all cases.
Multi-line expressions are also supported:
RUN: %if tool_available %{ \
RUN: %tool \
RUN: %} %else %{ \
RUN: true \
RUN: %}
Background and motivation:
D121727 [NVPTX] Integrate ptxas to LIT tests
https://reviews.llvm.org/D121727
Differential Revision: https://reviews.llvm.org/D122569
When a block containing llvm.coro.id is cloned during CHR, it inserts an invalid
PHI node with token type to the beginning of the block containing llvm.coro.begin.
To avoid such case, we exclude regions with llvm.coro.id.
Reviewed By: ChuanqiXu
Differential Revision: https://reviews.llvm.org/D124418
By using a shared index pool, we reduce the footprint of each "Element"
in the COO scheme and, in addition, reduce the overhead of allocating
indices (trading many allocations of vectors for allocations in a single
vector only). When the capacity is known, this means *all* allocation
can be done in advance.
This is a big win. For example, reading matrix SK-2005, with dimensions
50,636,154 x 50,636,154 and 1,949,412,601 nonzero elements improves
as follows (time in ms), or about 3.5x faster overall
```
SK-2005 before after speedup
---------------------------------------------
read 305,086.65 180,318.12 1.69
sort 2,836,096.23 510,492.87 5.56
pack 364,485.67 312,009.96 1.17
---------------------------------------------
TOTAL 3,505,668.56 1,002,820.95 3.50
```
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D124502
Renamed test/Analysis/CostModel/X86/splat-load.ll to shuffle-load.ll
to align it with AArch64's similar test.
Also added a complete list of checks for all vector combinations up to 512-bits.
Differential Revision: https://reviews.llvm.org/D124528
Constants in MLIR are not globally unique, unlike that in LLVM IR.
Therefore, reusing previous-translated constants might cause the user
operations not being dominated by the constant (because the
previous-translated ones can be placed in arbitrary place)
This indeed misses some opportunities where we actually can reuse a
previous-translated constants, but verbosity is not our first priority
here.
Differential Revision: https://reviews.llvm.org/D124404
More specifically, the llvm::Instruction generated by
llvm::ConstantExpr::getAsInstruction. Such Instruction will be deleted
right away, but it's possible that when getAsInstruction is called
again, it will create a new Instruction that has the same address with
the one we just deleted. Thus, we shouldn't keep it in the `instMap` to
avoid a conflicting index that triggers an assertion in
processInstruction.
Differential Revision: https://reviews.llvm.org/D124402
And move importer test files from `test/Target/LLVMIR` into
`test/Target/LLVMIR/Import`.
We simply translate struct-type ConstantAggregate(Zero) into a
serious of `llvm.insertvalue` operations against a `llvm.undef` root.
Note that this doesn't affect the original logics on translating
vector/array-type ConstantAggregate values.
Differential Revision: https://reviews.llvm.org/D124399
Similarly to LBOUND in https://reviews.llvm.org/D123237, fix UBOUND() folding
for constant arrays (for both w/ and w/o DIM=): convert
GetConstantArrayLboundHelper into common helper class for both lower/upper
bounds.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D123520
This is necessary to handle conversions of operations defined at runtime in extensible dialects.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D124353
DXIL doesn't support attributes added after LLVM 3.7. The DXILPrepare
pass removes those attributes so they should never be present by the
time we reach the DXIL bitcode writer.
In the event that we somehow try to write a newer attribute in the DXIL
writer, we should fail hard (crash), because the output would be
invalid. This case should only be possible if the DXIL writer were
called without DXILPrepare being run first, which shouldn't be possible.
This patch also adds a default case to the switch statement over the
attribute list which covers all the removed cases and any new attribute
kinds that may be added in the future. The default case is handled like
other unsupported cases by a call to llvm_unreachable.
We dropped downstream support for Python 2 in the previous release. Now
that we have branched for the next release the window where this kind of
change could introduce conflicts is closing too. Remove Python 2 checks
from the test suite.
Differential revision: https://reviews.llvm.org/D124429
We dropped downstream support for Python 2 in the previous release. Now
that we have branched for the next release the window where this kind of
change could introduce conflicts is closing too. Start by getting rid of
Python 2 support in the Script Interpreter plugin.
Differential revision: https://reviews.llvm.org/D124429
An application can use the mere fact of epoll_wait returning an fd
as synchronization with the write on the fd that triggered the notification.
This pattern come up in an internal networking server (b/229276331).
If an fd is added to epoll, setup a link from the fd to the epoll fd
and use it for synchronization as well.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D124518
Only supports addition and multiplication for now; other cases
to be implemented.
Reviewed By: hanchung
Differential Revision: https://reviews.llvm.org/D124380
`index` type is converted to `i32` in SPIR-V. This is fine to
support for all signed/unsigned ops.
Reviewed By: hanchung
Differential Revision: https://reviews.llvm.org/D124451
Summary:
We provide the `-f(no-)openmp-new-driver` option to allow users to use
the old or new driver. Previously this wasn't handled in the expected
way and only `-fno-openmp-new-driver` was checked. This patch fixes that
by using the `hasFlag` method as is standard.
As older waves execute long sequences of VALU instructions, this may
prevent younger waves from address calculation and then issuing their
VMEM loads, which in turn leads the VALU unit to idle. This patch tries
to prevent this by temporarily raising the wave's priority.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D124246
IRCE is a function pass that operates on loops. If there are no loops in
the function (as seen through LI), we should avoid computing the
remaining expensive analyses (such as BPI). Reordered the analyses
requests and early return if there are no loops. This is an NFC with
compile time improvement.
The same will be done in a follow-up patch for the loop vectorizer.
Reviewed-By: nikic
Differential Revision: https://reviews.llvm.org/D124478
The test was broken (in the sense that it was not testing what it was
supposed to test) in two ways:
- a Makefile refactor caused it to stop being built with
-flimit-debug-info
- clang's constructor homing changed the "home" of the type
This patch fixes the Makefile, and modifies the source code to produce
the same result with both type homing strategies. Due to constructor
homing I had to use a different implicitly-defined function for the test
-- I chose the assignment operator.
I also added some sanity checks to the test to ensure that the test is
indeed operating on limited debug info.
Add tests with selects that match both logical AND and logical OR. Note
that some of the tests get miscompiled at the moment.
Also moves a related test to the newly added test file.
Cuurently we always export STATEPOINT results (GC pointers lowered via VRegs)
to virtual registers. When processing gc.relocate instructions we have to
generate CopyFromRegs node and then export it to VReg again if gc.relocate
is used in other basic blocks. This results in generation of extra COPY MIR
instruction if statepoint and its gc.relocate are in the same BB, but gc.relocate
result is used in other blocks.
This patch changes this behavior to export statepoint results only if used
in other basic blocks. For local uses StatepointLoweringState.(get|set)Location()
API is used to communicate appropriate statepoint result from `LowerStatepoint()`
to `visitGCRelocate()`
This is NFC and is purely compile time optimization. On big methids it can improve
codegen compile time up to 10%.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D124444
Given a larger-than-legal shuffle mask, the final codegen will split
into multiple sub-vectors. This attempts to model that in
AArch64TTIImpl::getShuffleCost, splitting masks up according to the size
of the legalized vectors. If the sub-masks have at most 2 input sources
we can call getShuffleCost on them and sum the costs, to get a more
accurate final cost for the entire shuffle. The call to
improveShuffleKindFromMask helps to improve the shuffle kind for the
sub-mask cost call.
Differential Revision: https://reviews.llvm.org/D123414
Lowering of FailImage statement generates a runtime call and the
unreachable operation. The unreachable operation cannot terminate
a structured operation like the IF operation, hence mark as
unstructured.
Note: This patch is part of upstreaming code from the fir-dev branch of
https://github.com/flang-compiler/f18-llvm-project.
Reviewed By: clementval
Differential Revision: https://reviews.llvm.org/D124520
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>