Look up the -arch flags to pass to the mig invocation from an
optionally-defined MIG_ARCHS variable. We can't use CMAKE_OSX_ARCHS
because the {i,tv,watch}OS builds don't use this mechanism to achieve
fat builds (they build each slice separately & then lipo them together).
This supercedes the mig -arch/-isysroot fix from
510758dae2.
Each function is with this compiled with the SystemZSubtarget initialized
from the functions attributes.
Review: Ulrich Weigand.
Differential Revision: https://reviews.llvm.org/D74086
The idiom
for (auto i = n - n; i < n; i += 1)
was intended to automatically derive the type of i from n
(signed/unsigned int) and avoid the 'mixed signed/unsigned comparison'
warning. However, almost-always-auto was never used in the LLVM coding
style (although we used it in Polly for some time) and I did never
intended to use this idiom upstream.
PVS Studio may warns about this idiom as 'warning: both sides of
operator are equivalent [misc-redundant-expression]'.
Remove the use of auto and directly use unsigned.
Also see http://llvm.org/PR44768
This CL refactors EDSCs to layer them better and break unnecessary
dependencies. After this refactoring, the top-level EDSC target only
depends on IR but not on Dialects anymore and each dialect has its
own EDSC directory.
This simplifies the layering and breaks cyclic dependencies.
In particular, the declarative builder + folder are made explicit and
are now confined to Linalg.
As the refactoring occurred, certain classes and abstractions that were not
paying for themselves have been removed.
Differential Revision: https://reviews.llvm.org/D74302
Non-AVX512BW targets failed to concatenate 256-bit shifts back to 512-bits (split during 512-bit shuffle lowering as they don't have v32i16/v64i8 types).
This reverts commit b54a8ec1bc.
The commit triggered debug invariance (different output with/without
-g). The patch seems to have exposed a pre-existing invariance problem
in GlobalOpt, which I'll write a bug report for.
This patch renames `__personality_routine` to `_Unwind_Personality_Fn`
in `unwind.h`. Both `unwind.h` from clang and GCC headers use this name
instead of `__personality_routine`. With this patch one is also able to
build libc++abi with libunwind support on Windows.
Patch by Markus Böck!
The current standard to llvm conversion pass lowers subview ops only if
dynamic offsets are provided. This commit extends the lowering with a
code path that uses the constant offset of the target memref for the
subview op lowering (see Example 3 of the subview op definition for an
example) if no dynamic offsets are provided.
Differential Revision: https://reviews.llvm.org/D74280
These are generated and do not need to have the same values.
We are defining separate subregs for R600 and GCN but then
using AMDGPU subregs on R600.
Differential Revision: https://reviews.llvm.org/D74248
As noted on PR44379, we didn't attempt to lower vector shuffles using bit rotations on XOP/AVX512F targets.
This patch lowers to uniform ISD:ROTL nodes - ROTR isn't supported by XOP and they are interchangeable for constant values anyway.
There might be cases where targets without ISD:ROTL support would benefit from this (expanding to SRL+SHL+OR), which I'll investigate in a future patch.
REAPPLIED rGe82e17d4d4ca after reversion at rG39eade73a567 - fixed offset matching in matchShuffleAsBitRotate.
Summary: Such implementations may override the class's own implementation, and even be a danger in case someone later comes and adds one to the class itself. Most times this has been encountered have been a mistake.
Reviewers: stephanemoore, benhamilton, dmaclach
Reviewed By: stephanemoore, benhamilton, dmaclach
Subscribers: dmaclach, mgorny, cfe-commits
Tags: #clang-tools-extra, #clang
Differential Revision: https://reviews.llvm.org/D72876
Summary:
Nothing critical, just a few potential improvements I've noticed while reading
the code:
- return `false` when symbolizer buffer is too small to read all data
- invert some conditions to reduce indentation
- prefer `nullptr` over `0` for pointers; init some pointers on stack;
- remove minor code duplication
Reviewers: eugenis, vitalybuka
Subscribers: dberris, #sanitizers, llvm-commits, kcc
Tags: #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D74137
If a debug line section with version of greater than 5 is encountered,
prior to this change the parser would accept it and treat it as version
5. This might work to some extent, but then it might not at all, as it
really depends on the format of the unspecified future version, which
will be different (otherwise there would be no point in changing the
version number). Any information we could provide has a good chance of
being invalid, so we should just refuse to parse such tables.
Reviewed by: dblaikie, MaskRay
Differential Revision: https://reviews.llvm.org/D74204
This patch makes the following System Registers Read Only:
- CurrentEL
- ICH_MISR_EL2
- PMBIDR_EL1
- PMSIDR_EL1
as found in:
https://developer.arm.com/docs/ddi0595/e/aarch64-system-registers
Relative line numbers were also added to the tests so we get more
informative error messages on failure.
Change-Id: I963b4f01ca5737b58f9e8e7abe9ca1d99e328758
Add a simplification to fuse a manual vector extract with shifts and
truncate into a bitcast.
Unpacking and packing values into vectors is only optimized with
extractelement instructions, not when manually unpacked using shifts
and truncates.
This patch simplifies shifts and truncates into a bitcast if possible.
Simplify (build_vec (trunc $1)
(trunc (srl $1 width))
(trunc (srl $1 (2 * width))) ...)
to (bitcast $1)
Differential Revision: https://reviews.llvm.org/D73892
The existing (default) calling convention for memrefs in standard-to-LLVM
conversion was motivated by interfacing with LLVM IR produced from C sources.
In particular, it passes a pointer to the memref descriptor structure when
calling the function. Therefore, the descriptor is allocated on stack before
the call. This convention leads to several problems. PR44644 indicates a
problem with stack exhaustion when calling functions with memref-typed
arguments in a loop. Allocating outside of the loop may lead to concurrent
access problems in case the loop is parallel. When targeting GPUs, the contents
of the stack-allocated memory for the descriptor (passed by pointer) needs to
be explicitly copied to the device. Using an aggregate type makes it impossible
to attach pointer-specific argument attributes pertaining to alignment and
aliasing in the LLVM dialect.
Change the default calling convention for memrefs in standard-to-LLVM
conversion to transform a memref into a list of arguments, each of primitive
type, that are comprised in the memref descriptor. This avoids stack allocation
for ranked memrefs (and thus stack exhaustion and potential concurrent access
problems) and simplifies the device function invocation on GPUs.
Provide an option in the standard-to-LLVM conversion to generate auxiliary
wrapper function with the same interface as the previous calling convention,
compatible with LLVM IR porduced from C sources. These auxiliary functions
pack the individual values into a descriptor structure or unpack it. They also
handle descriptor stack allocation if necessary, serving as an allocation
scope: the memory reserved by `alloca` will be freed on exiting the auxiliary
function.
The effect of this change on MLIR-generated only LLVM IR is minimal. When
interfacing MLIR-generated LLVM IR with C-generated LLVM IR, the integration
only needs to require auxiliary functions and change the function name to call
the wrapper function instead of the original function.
This also opens the door to forwarding aliasing and alignment information from
memrefs to LLVM IR pointers in the standrd-to-LLVM conversion.
The DebugInfo/dwarfdump-invalid-line-table test used a pre-canned binary
generated by a fuzzer to demonstrate a bug fix. Unfortunately, the
binary is rigid and requires hand-editing if we change behaviour, such
as rejecting certain properties within it (as I plan on doing in another
change).
Rather than hand-edit the binary, I have replaced it with two tests. The
first tests the high-level code path from the debug line parser that
produces the same error as this test previously did, and the second is a
set of unit test cases that comprehensively cover the
FormValue::skipValue method, which in turn covers the area that the
original bug fix touched.
Reviewed by: MaskRay, dblaikie
Differential Revision: https://reviews.llvm.org/D74202
This change implements the llvm intrinsic llvm.read_register for
the SystemZ platform which returns the value of the specified
register
(http://llvm.org/docs/LangRef.html#llvm-read-register-and-llvm-write-register-intrinsics).
This implementation returns the value of the stack register, and
can be extended to return the value of other registers. The
implementation for this intrinsic exists on various other platforms
including Power, x86, ARM, etc. but missing on SystemZ.
Reviewers: uweigand
Differential Revision: https://reviews.llvm.org/D73378
Both methods have compile time constraints that we should test against.
Patch by Michael Schellenberger Costa
Differential Revision: https://reviews.llvm.org/D71999
On macOS, libc++ headers are distributed with the compiler, not
the sysroot. Without this, compiling a file that includes something
like <string> won't compile with gn-built clang without manual tweaks.
I used to do the manual tweaks, but now that other people are starting
to use this on mac, let's make it Just Work.
(This is marginally nicer than the cmake build now in that you can
just build 'clang' and it'll do the right thing.)
Differential Revision: https://reviews.llvm.org/D74247
As noted on PR44379, we didn't attempt to lower vector shuffles using bit rotations on XOP/AVX512F targets.
This patch lowers to uniform ISD:ROTL nodes - ROTR isn't supported by XOP and they are interchangeable for constant values anyway.
There might be cases where targets without ISD:ROTL support would benefit from this (expanding to SRL+SHL+OR), which I'll investigate in a future patch.
Also, non-AVX512BW targets fail to concatenate 256-bit rotations back to 512-bits (split during shuffle lowering as they don't have v32i16/v64i8 types).
---
Internal shuffle tests indicate theres a bug somewhere that I haven't been able to track down yet.
m_size can only be 1 or 0 and indicates if the optional has a value. Calling
it 'm_size', giving it a size_t data type and then also comparing indices against
'size' is very confusing. Let's just make this a bool.
This patch adds a first version of a MemorySSA based DSE. It is missing
a lot of features, which will get added as follow-ups, to help to keep
the review manageable.
The patch uses the following general approach: given a MemoryDef, walk
upwards to find clobbering MemoryDefs that may be killed by the
starting def. Then check that there are no uses that may read the
location of the original MemoryDef in between both MemoryDefs. A bit
more concretely:
For all MemoryDefs StartDef:
1. Get the next dominating clobbering MemoryDef (DomAccess) by walking upwards.
2. Check that there no reads between DomAccess and the StartDef by checking
all uses starting at DomAccess and walking until we see StartDef.
3. For each found DomDef, check that:
1. There are no barrier instructions between DomDef and StartDef (like
throws or stores with ordering constraints).
2. StartDef is executed whenever DomDef is executed.
3. StartDef completely overwrites DomDef.
4. Erase DomDef from the function and MemorySSA.
The patch uses a very simple approach to guarantee that no throwing
instructions are between 2 stores: We only allow accesses to stack
objects, access that are in the same basic block if the block does not
contain any throwing instructions or accesses in functions that do
not contain any throwing instructions. This will get lifted later.
Besides adding support for the missing cases, there is plenty of additional
potential for improvements as follow-up work, e.g. the way we visit stores
(could be just a traversal of the MemorySSA, rather than collecting them
up-front), using the alias information discovered during walking to optimize
the MemorySSA.
This is loosely based on D40480 by Dave Green.
Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea, Tyker
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D72700