When we are creating jobs for the new driver we first check the cache to
see if the job was already created as a part of the offloading
toolchain. This lookup would sometimes fail if the bound architecture
was set for the host during offloading. We want to ignore the bound
architecture here because it is not relevant for looking up host
actions. Previously it was set on some
machines and would cause the cache lookup to fail.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D118858
Removing the base class of std::basic_string is not an ABI break, so we can remove any references to it from the header.
Reviewed By: ldionne, Mordante, #libc
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D118733
* Give I[1] and I[-1] a name (see the sketch below):
- Easier to understand
- Easier to debug (since you don't go through operator[] every time)
* TheLine->First != TheLine->Last follows, since the last token is an
l_brace and the first isn't.
* Factor out the check for is(tok::l_brace).
* Drop else after return.
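A rough sketch of the renaming, with hypothetical names (the surrounding
clang-format code is elided):
// Name the neighbouring lines once instead of spelling out I[-1] / I[1]
// and going through operator[] each time.
const AnnotatedLine *PreviousLine = I[-1];
const AnnotatedLine *NextLine = I[1];
// ... use PreviousLine/NextLine in the checks below ...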
Differential Revision: https://reviews.llvm.org/D115060
This commit changes the condition that required a comment to start with
an alphanumeric character: formatting is now left unchanged only for a
certain set of leading characters (currently horizontal whitespace and
punctuation), in order to support a wider set of leading characters
unrelated to documentation generation directives.
Reviewed By: HazardyKnusperkeks
Differential Revision: https://reviews.llvm.org/D118869
A significant number of our tests in C accidentally use functions
without prototypes. This patch converts the function signatures to have
a prototype for the situations where the test is not specific to K&R C
declarations. e.g.,
void func();
becomes
void func(void);
This is the first batch of tests being updated (there are a significant
number of other tests left to be updated).
The conversion only worked on CUs in the main binary. If it is a
skeleton CU, it will now use the DWO CU when invoking handleDie.
Test Plan:
llvm-lit
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D118521
The idea here is to have a verify routine we can call during scheduling to ensure broken invariants are reported. The intent is to help in debugging scheduling bugs.
At the moment, only the most basic properties are checked; several others that I thought held turned out to report failures when added.
Induction variable calculation was ignoring the scf.for step value. Fix it to get
the correct induction variable value in the prologue.
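For illustration only (hypothetical builder code; the actual pass differs),
the prologue induction variable has to include the step, i.e. lb + i * step
rather than lb + i:
// Sketch with hypothetical names: IV for the i-th prologue iteration.
Value idx = b.create<arith::ConstantIndexOp>(loc, i);
Value off = b.create<arith::MulIOp>(loc, idx, forOp.getStep());
Value iv = b.create<arith::AddIOp>(loc, forOp.getLowerBound(), off);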
Differential Revision: https://reviews.llvm.org/D118932
Extend "fallthrough" to allow a third option: "fallback". Fallthrough
allows the original path to be used if the redirected (or mapped) path
fails. Fallback is the reverse of this, i.e. use the original path first
and fall back to the mapped path otherwise.
While this result *can* be achieved today using multiple overlays, this
adds a much more intuitive option. As an example, take two directories
"A" and "B". We would like files from "A" to be used, unless they don't
exist, in which case the VFS should fall back to those in "B".
With the current fallthrough option this is possible by adding two
overlays: one mapping from A -> B and another mapping from B -> A. Since
the frontend *nests* the two RedirectingFileSystems, the result will
be that "A" is mapped to "B" and back to "A", unless it isn't in "A" in
which case it falls through to "B" (or fails if it exists in neither).
Using "fallback" semantics allows a single overlay instead: one mapping
from "A" to "B" but only using that mapping if the operation in "A"
fails first.
"redirect-only" is used to represent the current "fallthrough: false"
case.
Differential Revision: https://reviews.llvm.org/D117937
Fixes segfaults on x86_64 caused by instrumented code running before
shadow is set up.
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D118171
GCNDownwardRPTracker::reset() skips debug instructions when setting NextMI, so RPTracker.getNext() will never return the beginning of a sched region if that instruction is a debug value. In this case we never set the live-ins for that block.
Add a check to see if getNext() also equals the MI after skipping debug instructions.
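Roughly the shape of the added check, as a sketch with hypothetical names:
// If the region begins with debug instructions, also compare against the
// first non-debug instruction, so live-ins still get recorded.
auto NonDbgBegin = skipDebugInstructionsForward(RegionBegin, RegionEnd);
bool AtRegionStart = RPTracker.getNext() == RegionBegin ||
                     RPTracker.getNext() == NonDbgBegin;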
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D118853
AddressingModeMatcher::matchOperationAddr may attempt to shift a
variable by the same amount as found in a SHL instruction in the IR.
This was done without considering that there could be undefined
behavior in the IR, so the shift performed when compiling could end up
having undefined behavior as well.
This patch avoids UB in CodeGenPrepare by making sure that we limit
the shift amount used, in a similar way as is already done in
CodeGenPrepare::optimizeLoadExt.
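The guard amounts to clamping the constant shift amount before reusing it,
along these lines (sketch, hypothetical names):
// Never shift by more than the bit width of the value; a poison-producing
// shift in the IR must not turn into UB inside the compiler.
uint64_t ShiftAmt =
    std::min(ShlAmtConst->getZExtValue(), uint64_t(BitWidth - 1));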
Differential Revision: https://reviews.llvm.org/D118602
VS gives this warning for an integer constant:
AMDGPUGenDAGISel.inc(214687): warning C4334: '<<': result of 32-bit
shift implicitly converted to 64 bits (was 64-bit shift intended?)
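The usual fix for C4334 is to make the shift explicitly 64-bit, e.g.
(generic example, not the exact generated line):
uint64_t Mask = uint64_t(1) << Shift; // instead of: 1 << Shift (32-bit shift,
                                      // widened to 64 bits afterwards)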
- If not using `llvm-config`, `LLVM_MAIN_SRC_DIR` now has a sane default
- `LLVM_CONFIG_PATH` will continue to work for LLD for back compat.
- More quoting of paths, out of an abundance of caution.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D118792
In AArch64ISelLowering.cpp this patch implements this fold:
GEP (%ptr, (stepvector(A) + splat(%offset)) << splat(B))
into GEP (%ptr + (%offset << B), step_vector(A << B))
The above transform simplifies the index operand so that it can be expressed
as i32 elements.
This allows using only one gather/scatter assembly instruction instead of two.
Patch by Paul Walker (@paulwalker-arm).
Depends on D117900
Differential Revision: https://reviews.llvm.org/D118345
In AArch64ISelLowering.cpp this patch implements this fold:
GEP (%ptr, (splat(%offset) + stepvector(A)))
into GEP ((%ptr + %offset), stepvector(A))
The above transform simplifies the index operand so that it can be expressed
as i32 elements.
This allows using only one gather/scatter assembly instruction instead of two.
Patch by Paul Walker (@paulwalker-arm).
Depends on D118459
Differential Revision: https://reviews.llvm.org/D117900
Reorder the methods and patterns to move related patterns/methods
closer (textually).
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D118870
This fixes the following error:
sanitizer_interface_internal.h:77:7: error: conflicting types for
'__sanitizer_get_module_and_offset_for_pc'
int __sanitizer_get_module_and_offset_for_pc(
common_interface_defs.h:349:5: note: previous declaration is here
int __sanitizer_get_module_and_offset_for_pc(void *pc, char *module_path,
I am getting it in code that uses sanitizer_common (and thus includes the
internal headers), but also transitively includes the public headers in
tests via an internal version of gtest.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D118910
This more closely mirrors Apple's libtool, and also potentially makes it
clearer where the issue lies for multi-arch archives.
Differential Revision: https://reviews.llvm.org/D118867
This patch adds more documentation for the OpenMP offloading driver.
This includes a new file that describes the overall pipeline because
that was not previously explained in full elsewhere.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D118815
D116804 proposes to alter codegen on this example based on
CPU tuning, so check a variety of models to confirm it works
as expected. We already have this test mixed in with several
others in another test file, but it seems wasteful to add so
many RUN lines to check this difference over and over again.
In some cases, when selecting a (trunc (srl)) pattern, the srl gets translated
to a v_lshrrev_b32_e64 instruction whereas the truncation gets selected to
a sequence of v_and_b32_e64 and v_cmp_eq_u32_e64. In the final ISA, this appears
as selecting the nth bit:
v_lshrrev_b32_e32 v0, 2, v1
v_and_b32_e32 v0, 1, v0
v_cmp_eq_u32_e32 vcc_lo, 1, v0
However, when the value used in the right shift is known at compilation time, the
whole sequence can be reduced to two VALUs when the constant operand in the v_and is adjusted to (1 << lshrrev_operand):
v_and_b32_e32 v0, (1 << 2), v1
v_cmp_ne_u32_e32 vcc_lo, 0, v0
In the example above, the following pseudo-code:
v0 = (v1 >> 2)
v0 = v0 & 1
vcc_lo = (v0 == 1)
would be translated to:
v0 = v1 & 0b100
vcc_lo = (v0 == 0b100)
which should yield an equivalent result.
This is a little bit hard to test as one needs to force the SelectionDAG to
contain the nodes before instruction selection, but the test sequence was
roughly derived from a production shader.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D118461
Some translations do work with unregistered dialects; this allows one
to write test cases against them. It works the same way as it does for
mlir-opt.
Differential Revision: https://reviews.llvm.org/D118872