If most elements of BUILD_VECTOR are the same, with a few different
elements, it is better to use DUP for the common elements and
INSERT_VECTOR_ELT for the different elements.
Currently this transform is guarded quite restrictively to only trigger
in clearly beneficial cases.
With D90176, the lowering for patterns originating from code like
` float32x4_t y = {a,a,a,0};` (common in 3D apps) are lowered even
better (unnecessary fmov is removed).
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D90233
https://reviews.llvm.org/D88060
This adds the following combines
1) build_vector formation from insert_vec_elts
2) insert_vec_elts (build_vector) -> build_vector
Check for duplicate clauses associated with directive. Clauses can appear only once
in the 4 lists associated with each directive (allowedClauses, allowedOnceClauses,
allowedExclusiveClauses, requiredClauses). Duplicates were already present (removed with this
patch) or were introduce in new patches by mistake (D89861).
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D90241
Fix an out-of-bounds shift in emitLegacyZExt by using a slightly more
complicated dwarf expression to create the zext mask.
This addresses a UBSan diagnostic seen when compiling compiler-rt
(llvm.org/PR47927).
rdar://70307714
Differential Revision: https://reviews.llvm.org/D89838
Make the virtual method Toolchain::GetDefaultStackProtectorLevel()
return an explict enum value rather than an integral constant. This
makes the code subjectively easier to read, and should help prevent bugs
that may (or may never) arise from changing the enum values. Previously,
these were just kept in sync via a comment, which is brittle. The trade
off is including a additional header in a few new places. It is not
necessary, but in my opinion helps the readability.
Split off from https://reviews.llvm.org/D90194 to help cut down on lines
changed in code review.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D90271
Define the __vector_pair and __vector_quad types that are used to manipulate
the new accumulator registers introduced by MMA on PowerPC. Because these two
types are specific to PowerPC, they are defined in a separate new file so it
will be easier to add other PowerPC specific types if we need to in the future.
Differential Revision: https://reviews.llvm.org/D81508
Completing the series of FIXME removals for special-case intrinsics:
50dfa19cc7f2c25c7079c963bde01501ea93d85d
This one looks quite different than the others. The size/blended
cost is still potentially very far off from the throughput cost,
but this is hopefully not worse on the whole. It looks like the
underlying costs for the expanded shift/logic have their own
cost-kind limitations. Also, we are not asking the target if
it has a legal funnel shift op, so we just assume that the
intrinsic gets expanded.
Getting the body of a Module is a common need which justifies a
dedicated accessor instead of forcing users to go through the
region->blocks->front unwrapping manually.
Differential Revision: https://reviews.llvm.org/D90287
This patch modifies the Clang AVR toolchain so that it always passes
the '-Tdata=0x800100' to the linker for ATmega328 devices. This matches
AVR-GCC behaviour, and also corresponds to the address of the start of
the data section in data space according to the ATmega328 datasheet.
Without this, clang does not produce a valid ATmega328 binary.
When targeting all non-ATmega328 chips, a warning will be emitted due to
the fact that proper handling for the chips data section address is
not yet implemented.
I've held off adding other microcontrollers for now, mostly because the
AVR toolchain logic is smeared across LLVM core TableGen files, and two Clang
libraries. The 'family detection' logic is also only implemented for
ATmega328 at the moment, for similar reasons.
In the future, I aim to write an RFC to llvm-dev to find a better way
for LLVM to expose target-specific details such as these to compiler
frontends.
Differential Revision: https://reviews.llvm.org/D86629
both deferred_globals.cpp namespace.cpp require lldb in order to run and will
fail if it's not available.
add the required lines to the top of the tests.
As proposed in https://github.com/WebAssembly/simd/pull/376. This commit
implements new builtin functions and intrinsics for these instructions, but does
not yet add them to wasm_simd128.h because they have not yet been merged to the
proposal. These are the first instructions with opcodes greater than 0xff, so
this commit updates the MC layer and disassembler to handle that correctly.
Differential Revision: https://reviews.llvm.org/D90253
Remove the need to heap allocate a string for each style option lookup while reading or writing options.p
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D90244
Basically, this just improves the dump of the Value stored within a location.
If the defining instruction number is zero, it means it is "live-in".
Before the patch:
ESI --> bb 0 inst 0 loc ESI
After:
ESI --> Value{bb: 0, inst: live-in, loc: ESI}
This is an NFC.
Differential Revision: https://reviews.llvm.org/D90309
[libomptarget][nvptx] Undef, weak shared variables
Shared variables on nvptx, and LDS on amdgcn, are uninitialized at
the start of kernel execution. Therefore create the variables with
undef instead of zeros, motivated in part by the amdgcn back end
rejecting LDS+initializer.
Common is zero initialized, which seems incompatible with shared. Thus
change them to weak, following the direction of
https://reviews.llvm.org/rG7b3eabdcd215
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D90248
deferred_globals.cpp: Verify that debug information for a local variable does
not hide a global definition that has the same name
namespace.cpp: Ensure that the debug information for a global variable
includes namespace information.
Differential Revision: https://reviews.llvm.org/D89462
Author: Nabeel Omer <nabeel.omer@sony.com>
In rG6d656c9691d4 I had to relax the check from
`CONTENT: 21 3c 61 72 63 68 3e 0a 12{{$}}`
to
`CONTENT: 21 3c 61 72 63 68 3e 0a 12`
to fix the FreeBSD bot quickly: http://lab.llvm.org:8011/#/builders/28/builds/547
It turns out that "od" prints a trailing white space on FreeBSD, that is
why EOL mark ({{$}}) can't be used. But we still want to check the output size.
This patch adds a check of output size with "wc -c", similar to how it is done
below in the same test. This restores the original strictness.
The explicit call to `std::setlocale(LC_ALL, "C")` isn't required, since
the Standard already says the equivalent of this call is performed on
program startup.