Summary: Just like the FIXME says, there is no alignment requirement for MMX.
Reviewers: RKSimon, zvi, igorb
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36815
llvm-svn: 311090
Base::TraverseStmt when visiting the then/else branches of if statements
This ensures that the statement stack is correctly tracked and correct
multi-statement fixit is generated inside of an if (@available)
llvm-svn: 311088
In the case where dfsan provides a custom wrapper for a function,
shadow parameters are added for each parameter of the function.
These parameters are i16s. For targets which do not consider this
a legal type, the lack of sign extension information would cause
LLVM to generate anyexts around their usage with phi variables
and calling convention logic.
Address this by introducing zero exts for each shadow parameter.
Reviewers: pcc, slthakur
Differential Revision: https://reviews.llvm.org/D33349
llvm-svn: 311087
Summary:
Generate the type table from the types used by a target rather than hard-coding
the union of types used by all targets.
Depends on D36084
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: rovka
Subscribers: kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D36085
llvm-svn: 311084
- If we have global arrays, we would like to rewrite them to global
pointers which are allocated using `cudaMallocManaged`.
- If we have allocas in a function, we would like to rewrite them to
heap-allocations with `cudaMallocManaged` and `cudaFree`.
- With these rewrite mechanisms, we can offload _any_ function to the
GPU with no code rewrite whatsover.
Differential Revision: https://reviews.llvm.org/D36516
llvm-svn: 311080
Summary:
The current fix will break the compilation -- because braced list is not
deducible in std::make_unique (with the use of forwarding) without
specifying the type explicitly.
We could support it in the future.
Reviewers: alexfh
Reviewed By: alexfh
Subscribers: JDevlieghere, xazax.hun, cfe-commits
Differential Revision: https://reviews.llvm.org/D36786
llvm-svn: 311078
VPlan is an ongoing effort to refactor and extend the Loop Vectorizer. This
patch introduces the VPlan model into LV and uses it to represent the vectorized
code and drive the generation of vectorized IR.
In this patch VPlan models the vectorized loop body: the vectorized control-flow
is represented using VPlan's Hierarchical CFG, with predication refactored from
being a post-vectorization-step into a vectorization planning step modeling
if-then VPRegionBlocks, and generating code inline with non-predicated code. The
vectorized code within each VPBasicBlock is represented as a sequence of
Recipes, each responsible for modelling and generating a sequence of IR
instructions. To keep the size of this commit manageable the Recipes in this
patch are coarse-grained and capture large chunks of LV's code-generation logic.
The constructed VPlans are dumped in dot format under -debug.
This commit retains current vectorizer output, except for minor instruction
reorderings; see associated modifications to lit tests.
For further details on the VPlan model see docs/Proposals/VectorizationPlan.rst
and its references.
Authors: Gil Rapaport and Ayal Zaks
Differential Revision: https://reviews.llvm.org/D32871
llvm-svn: 311077
Summary:
Support the case where an operand of a pattern is also the whole of the
result pattern. In this case the original result and all its uses must be
replaced by the operand. However, register class restrictions can require
a COPY. This patch handles both cases by always emitting the copy and
leaving it for the register allocator to optimize.
The previous commit failed on Windows machines due to a flaw in the sort
predicate which allowed both A < B < C and B == C to be satisfied
simultaneously. The cause of this was some sloppiness in the priority order of
G_CONSTANT instructions compared to other instructions. These had equal priority
because it makes no difference, however there were operands had higher priority
than G_CONSTANT but lower priority than any other instruction. As a result, a
priority order between G_CONSTANT and other instructions must be enforced to
ensure the predicate defines a strict weak order.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Subscribers: javed.absar, kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D36084
llvm-svn: 311076
We would previously crash on next script:
MEMORY { name : ORIGIN = .; }
Patch fixes that.
Differential revision: https://reviews.llvm.org/D36138
llvm-svn: 311073
The idea of this patch is to continue the scheduler state over an MBB boundary
in the case where the successor block has only one predecessor. This means
that the scheduler will continue in the successor block (after emitting any
branch instructions) with e.g. maintained processor resource counters.
Benchmarks have been confirmed to benefit from this.
The algorithm in MachineScheduler.cpp that extracts scheduling regions of an
MBB has been extended so that the strategy may optionally reverse the order
of processing the regions themselves. This is controlled by a new method
doMBBSchedRegionsTopDown(), which defaults to false.
Handling the top-most region of an MBB first also means that a top-down
scheduler can continue the scheduler state across any scheduling boundary
between to regions inside MBB.
Review: Ulrich Weigand, Matthias Braun, Andy Trick.
https://reviews.llvm.org/D35053
llvm-svn: 311072
When v1i1 is legal (e.g. AVX512) the legalizer can reach
a case where a v1i1 SETCC with an illgeal vector type operand
wasn't scalarized (since v1i1 is legal) but its operands does
have to be scalarized. This used to assert because SETCC was
missing from the vector operand scalarizer.
This patch attemps to teach the legalizer to handle these cases
by scalazring the operands, converting the node into a scalar
SETCC node.
Differential revision: https://reviews.llvm.org/D36651
llvm-svn: 311071
If we want to substitute the relocation of derived pointer with gep of base then
we must ensure that relocation of base dominates the relocation of derived pointer.
Currently only check for basic block is present. However it is possible that both
relocation are in the same basic block but relocation of derived pointer is defined
earlier.
The patch moves the relocation of base pointer right before relocation of derived
pointer in this case.
Reviewers: sanjoy,artagnon,igor-laevsky,reames
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36462
llvm-svn: 311067
Summary:
This pass detangles induction variables from functions, which take variables by
reference. Most fortran functions compiled with gfortran pass variables by
reference. Unfortunately a common pattern, printf calls of induction variables,
prevent in this situation the promotion of the induction variable to a register,
which again inhibits any kind of loop analysis. To work around this issue
we developed a specialized pass which introduces separate alloca slots for
known-read-only references, which indicate the mem2reg pass that the induction
variables can be promoted to registers and consquently enable SCEV to work.
We currently hardcode the information that a function
_gfortran_transfer_integer_write does not read its second parameter, as
dragonegg does not add the right annotations and we cannot change old dragonegg
releases. Hopefully flang will produce the right annotations.
Reviewers: Meinersbur, bollu, singam-sanjay
Reviewed By: bollu
Subscribers: mgorny, pollydev, llvm-commits
Tags: #polly
Differential Revision: https://reviews.llvm.org/D36800
llvm-svn: 311066
This commit adds the functionality of performing reference counting on the
callee side for Integer Set Library (ISL) to Clang Static Analyzer's
RetainCountChecker.
Reference counting on the callee side can be extensively used to perform
debugging within a function (For example: Finding leaks on error paths).
Patch by Malhar Thakkar!
Differential Revision: https://reviews.llvm.org/D36441
llvm-svn: 311063
This reverts commit r311038.
Several buildbots are breaking, and at least one appears to be due to
the forwarding of physical regs enabled by this change. Reverting while
I investigate further.
llvm-svn: 311062
When lowering a VLA, we emit a __chstk call. However, this call can
internally clobber CPSR. We did not mark this register as an ImpDef,
which could potentially allow a comparison to be hoisted above the call
to `__chkstk`. In such a case, the CPSR could be clobbered, and the
check invalidated. When the support was initially added, it seemed that
the call would take care of preventing CPSR from being clobbered, but
this is not the case. Mark the register as clobbered to fix a possible
state corruption.
llvm-svn: 311061
We used to have a separate multiclass for AVX2 and SSE/AVX. Now we have one multiclass and pass the relevant differences.
We were also missing load patterns, though we had them for the AVX-512 version.
llvm-svn: 311059
Summary:
This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees.
I didn't notice any performance impact when bootstrapping clang with this patch.
The patch was originally committed in r311039 and reverted in r311049.
This revision fixes the problem with not adding a dependency on the
DominatorTreeWrapperPass for the LegacyPassManager.
Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki
Reviewed By: davide
Subscribers: grandinj, zhendongsu, llvm-commits, david2050
Differential Revision: https://reviews.llvm.org/D35869
llvm-svn: 311057
We had a lock to guard BAlloc from being used concurrently, but that
is not very easy to understand. This patch replaces it with a
std::unique_ptr.
llvm-svn: 311056
In dependent contexts we end up referencing these, so make sure they
have USRs, and have their declarations indexed. For the most part they
behave like typedefs, but we also need to worry about having multiple
using declarations with the same "name".
rdar://problem/33883650
llvm-svn: 311053
This way we can see what the current codegen looks like.
I've also explicitly added/removed the cmov attribute from the RUN lines,
so we know exactly what we're checking in the runs.
llvm-svn: 311052
The ASan runtime on many systems intercepts cxa_throw just so it
can call asan_handle_no_return first. Some newer systems such as
Fuchsia don't use interceptors on standard library functions at all,
but instead use sanitizer-instrumented versions of the standard
libraries. When libc++abi is built with ASan, cxa_throw can just
call asan_handle_no_return itself so no interceptor is required.
Patch by Roland McGrath
Differential Revision: https://reviews.llvm.org/D36599
llvm-svn: 311045
-no-integrated-as is not supported on some targets (e.g.,
x86_64-pc-windows-msvc). Testing using -save-temps is good enough to cover the
relevant logic, and that should work everywhere.
llvm-svn: 311043
Using Output.getFilename() to construct the file name used for optimization
recording in Clang::ConstructJob, when -c is provided, does not work correctly
if we're not using the integrated assembler. With -no-integrated-as (or
-save-temps) Output.getFilename() gives the name of the temporary assembly
file, not the final output file. Instead, use the final output (as provided by
-o). If this is not available, then fall back to using a name based on the
input file.
Fixes PR31532.
llvm-svn: 311041
Summary:
This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees.
I didn't notice any performance impact when bootstrapping clang with this patch.
Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki
Reviewed By: davide
Subscribers: grandinj, zhendongsu, llvm-commits, david2050
Differential Revision: https://reviews.llvm.org/D35869
llvm-svn: 311039