blockUntilIdle of a parent can't always be correctly implemented as
return ChildA.blockUntilIdle() && ChildB.blockUntilIdle()
The problem is that B can schedule work on A while we're waiting on it.
I believe this is theoretically possible today between CDB and background index.
Modules open more possibilities and it's hard to reason about all of them.
I don't have a perfect fix, and the abstraction is too good to lose. this patch:
- calls out why we block on workscheduler first, and asserts correctness
- documents the issue
- reduces the practical possibility of spuriously returning true significantly
This function is ultimately only for testing, so we're driving down flake rate.
Differential Revision: https://reviews.llvm.org/D96856
__start_/__stop_ references retain C identifier name sections such as
__llvm_prf_*. Putting these into a section group disables this logic.
The ELF section group semantics ensures that group members are retained
or discarded as a unit. When a function symbol is discarded, this allows
allows linker to discard counters, data and values associated with that
function symbol as well.
Note that `noduplicates` COMDAT is lowered to zero-flag section group in
ELF. We only set this for functions that aren't already in a COMDAT and
for those that don't have available_externally linkage since we already
use regular COMDAT groups for those.
Differential Revision: https://reviews.llvm.org/D96757
We don't yet have working codegen for the resulting unmerges, and if
we did it would probably be horrible.
Differential Revision: https://reviews.llvm.org/D97035
/reproduce: now works correctly with:
- /call-graph-ordering-file:
- /def:
- /natvis:
- /order:
- /pdbstream:
I went through all instances of MemoryBuffer::getFile() and made sure
everything that didn't already do so called takeBuffer().
For natvis, that wasn't possible since DebugInfo/PDB wants to take
owernship of the natvis buffer. For that case, I'm manually adding the
tar file entry.
/natvis: and /pdbstream: is slightly awkward, since createResponseFile()
always adds these flags to the response file but createPDB() (which
ultimately adds the files referenced by the flags) is only called if
/debug is also passed. So when using /natvis: without /debug with
/reproduce:, lld won't warn, but when linking using the response
file from the archive, it won't find the natvis file since it's not
in the tar. This isn't a new issue though, and after this patch things
at least work with using /natvis: _with_ debug with /reproduce:.
(Same for /pdbstream:)
Differential Revison: https://reviews.llvm.org/D97212
This adds support for serialization of `WasmEHFuncInfo`, in the form of
<Source BB Number, Unwind destination BB number>. To make YAML mapping
work, we needed to make a copy of the existing `SrcToUnwindDest` map
within `yaml::WebAssemblyMachineFunctionInfo`.
It was hard to add EH MIR tests for CFGStackify because `WasmEHFuncInfo`
could not be read from test MIR files. This adds the serialization
support for that to make EH MIR tests easier.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D97174
This renames variable and method names in `WasmEHFuncInfo` class to be
simpler and clearer. For example, unwind destinations are EH pads by
definition so it doesn't necessarily need to be included in every method
name. Also I am planning to add the reverse mapping in a later CL,
something like `UnwindDestToSrc`, so this renaming will make meanings
clearer.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D97173
This patch responds to a comment from @vitalybuka in D96203: suggestion to
do the change incrementally, and start by modifying this file name. I modified
the file name and made the other changes that follow from that rename.
Reviewers: vitalybuka, echristo, MaskRay, jansvoboda11, aaron.ballman
Differential Revision: https://reviews.llvm.org/D96974
This should fix the issue reported in D96972.
I don't have a good test case for this without those changes.
Differential Revision: https://reviews.llvm.org/D97082
Currently exception.mir runs LateEHPrepare and CFGStackify, but some
tests I'm planning to add shouldn't be run with LateEHPrepare, because
it is convenient to only run CFGStackify when testing things like unwind
mismatches and it is easier to add tests that are in phase right before
CFGStackify. This splits existing exception.mir into two files;
cfg-stackify-eh.mir will only run CFGStackify. Note that
`eh_label_tests` tests both LateEHPrepare and CFGStackify, so it is
still in exception.mir. `rethrow_arg_tests` has been converted to the
post-LateEHPrepare form to be moved into cfg-stackify-eh.mir, like
removing `CATCHRET` and such, because it does not really test anything
in LateEHPrepare.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D97175
Pointer operand of scatter loads does not remain scalar in the tree (it
gest vectorized) and thus must not be marked as the scalar that remains
scalar in vectorized form.
Differential Revision: https://reviews.llvm.org/D96818
The config option for this changed in
https://secure.phabricator.com/D21313 (or when that was rolled out).
I'm leaving the old config option, which may be in use by people with
older versions of arc.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D97215
The implementation of tuple's constructors and assignment operators
currently diverges from the way the Standard specifies them, which leads
to subtle cases where the behavior is not as specified. In particular, a
class derived from a tuple-like type (e.g. pair) can't be assigned to a
tuple with corresponding members, when it should. This commit re-implements
the assignment operators (BUT NOT THE CONSTRUCTORS) in a way much closer
to the specification to get rid of this bug. Most of the tests have been
stolen from Eric's patch https://reviews.llvm.org/D27606.
As a fly-by improvement, tests for noexcept correctness have been added
to all overloads of operator=. We should tackle the same issue for the
tuple constructors in a future patch - I'm just trying to make progress
on fixing this long-standing bug.
PR17550
rdar://15837420
Differential Revision: https://reviews.llvm.org/D50106
- Fix `preds` comments
- Delete nonexistent attributes in instructions (They used to exist in
clang-generated files, but I removed most of them to make the tests
tidy. We have only `nounwind` and `noreturn` left here.)
- Add missing `Function Attrs` comments in function declarations
None of these affect test function semantics or test results for now.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D97179
Prevent warning when the values are initialized using fields that will be initialized later or VarDecls defined in the constructors body.
Both of these cases can't be safely fixed.
Also improve logic of finding where to insert member initializers, previously it could be confused by in class member initializers.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D97132
This code alleviates some pathological loop parameters (lower,
upper, stride) within calculations involved in the static loop code. It
bounds the chunk size to the trip count if it is greater than the trip
count and also minimizes problematic code for when trip count < nth.
Differential Revision: https://reviews.llvm.org/D96426
The current implementation of tilePerfectlyNested utility doesn't handle
the non-unit step size. We have added support to perform tiling
correctly even if the step size of the loop to be tiled is non-unit.
Fixes https://bugs.llvm.org/show_bug.cgi?id=49188.
Differential Revision: https://reviews.llvm.org/D97037
Only attempt shutdown if lpReserved is NULL. The Windows documentation
states:
When handling DLL_PROCESS_DETACH, a DLL should free resources such as
heap memory only if the DLL is being unloaded dynamically (the
lpReserved parameter is NULL). If the process is terminating (the
lpReserved parameter is non-NULL), all threads in the process except the
current thread either have exited already or have been explicitly
terminated by a call to the ExitProcess function, which might leave some
process resources such as heaps in an inconsistent state. In this case,
it is not safe for the DLL to clean up the resources. Instead, the DLL
should allow the operating system to reclaim the memory.
Differential Revision: https://reviews.llvm.org/D96750
This patch limits the number of dispatch buffers (used for
loop worksharing construct) to between 1 and 4096.
Differential Revision: https://reviews.llvm.org/D96749
This patch adds Linalg named ops for various types of integer matmuls.
Due to limitations in the tc spec/linalg-ods-gen ops cannot be type
polymorphic, so this instead creates new ops (improvements to the
methods for defining Linalg named ops are underway with a prototype at
https://github.com/stellaraccident/mlir-linalgpy).
To avoid the necessity of directly referencing these many new ops, this
adds additional methods to ContractionOpInterface to allow classifying
types of operations based on their indexing maps.
Reviewed By: nicolasvasilache, mravishankar
Differential Revision: https://reviews.llvm.org/D97006
operands[2] can be nullptr here. I'm not able to build a lit test for
this because of the commutative reordering of operands. It's possible to
trigger this with a createOrFold<BroadcastOp> though.
Differential Revision: https://reviews.llvm.org/D97206
ICMP_NE predicates cannot be directly represented as constraint. But we
can use ICMP_UGT instead ICMP_NE for %x != 0.
See https://alive2.llvm.org/ce/z/XlLCsW
A previous patch moved the index versions. This moves the rest.
I also removed the custom lowering for VLEFF since we can now
do everything directly in the isel handling.
I had to update getLMUL to handle mask registers to index the
pseudo table correctly for VLE1/VSE1.
This is good for another 15K reduction in llc size.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D97097
This commit fixes a bug in affine fusion pipeline where an
incorrect fusion is performed despite a dealloc op is present
between a producer and a consumer. This is done by creating a
node for dealloc op in the MDG.
Reviewed By: bondhugula, dcaballe
Differential Revision: https://reviews.llvm.org/D97032
When compiling with ccache, compiler commands get split into smaller steps
and clang's default -Wunused-command-line-argument complains about unused
include directory arguments. In combination -Werror, compilation aborts.
If CMAKE_C_FLAGS contains -Wno-unused-command-line-argument or
-Wno-error=unused-command-line-argument, the latter flag is passed into the
build script.
This is a re-commit. The previous version was reverted because of failing
tests.
Differential Revision: https://reviews.llvm.org/D96762
If the call is readnone, then there may not be any MemoryAccess
associated with the call. Bail out in that case.
This fixes the issue reported at
https://reviews.llvm.org/D94376#2578312.
If a static assert has a message as the right side of an and condition, suggest a fix it of replacing the '&&' to ','.
`static_assert(cond && "Failed Cond")` -> `static_assert(cond, "Failed cond")`
This use case comes up when lazily replacing asserts with static asserts.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D89065
When cloning instructions during jump threading, also clone and
adapt any declared scopes. This is primarily important when
threading loop exits, because we'll end up with two dominating
scope declarations in that case (at least after additional loop
rotation). This addresses a loose thread from
https://reviews.llvm.org/rG2556b413a7b8#975012.
Differential Revision: https://reviews.llvm.org/D97154
Add -J to the f18 driver for compatibility with gfortran.
Add -module-dir for compatibility with the new flang driver.
They both set the output directory for .mod files and add the
directory to the search list. -module still only does the former.
Clean up the new driver test to match.
Differential Revision: https://reviews.llvm.org/D97164
In current implementation of `deviceRTLs`, we're using some functions
that are CUDA version dependent (if CUDA_VERSION < 9, it is one; otheriwse, it
is another one). As a result, we have to compile one bitcode library for each
CUDA version supported. A worse problem is forward compatibility. If a new CUDA
version is released, we have to update CMake file as well.
CUDA 9.2 has been released for three years. Instead of using various weird tricks
to make `deviceRTLs` work with different CUDA versions and still have forward
compatibility, we can simply drop support for CUDA 9.1 or lower version. It has at
least two benifits:
- We don't need to generate bitcode libraries for each CUDA version;
- Clang driver doesn't need to search for the bitcode lib based on CUDA version.
We can claim that starting from LLVM 12, OpenMP offloading on NVPTX target requires
CUDA 9.2+.
Reviewed By: jdoerfert, JonChesterfield
Differential Revision: https://reviews.llvm.org/D97003