The patch fixes Bug 25759 produced by inappropriate handling of unsigned
maximum SCEV expressions by SCEVRemoveMax. Without a fix, we get an infinite
loop and a segmentation fault, if we try to process, for example,
'((-1 + (-1 * %b1)) umax {(-1 + (-1 * %yStart)),+,-1}<%.preheader>)'.
It also fixes a potential issue related to signed maximum SCEV expressions.
Tested-by: Roman Gareev <gareevroman@gmail.com>
Fixed-by: Tobias Grosser <tobias@grosser.es>
Differential Revision: http://reviews.llvm.org/D15563
llvm-svn: 255922
This sets the maximum entry count among all functions in the program to the module using module flags. This allows the optimizer to use this information.
Differential Revision: http://reviews.llvm.org/D15163
llvm-svn: 255918
The rules for removing trivially dead stores are a lot less complicated than loads. Since we know the later store post dominates the former and the former dominates the later, unless the former has side effects other than the actual store, we can remove it. One slightly surprising thing is that we can freely remove atomic stores, even if the later one isn't atomic. There's no guarantee the atomic one was every visible.
For the moment, we don't handle DSE of ordered atomic stores. We could extend the same chain of reasoning to them, but the catch is we'd then have to model the ordering effect without a store instruction. Since our fences are a stronger than our operation orderings, simple using a fence isn't an obvious win. This arguable calls for a refinement in our fence specification, but that's (much) later work.
Differential Revision: http://reviews.llvm.org/D15352
llvm-svn: 255914
C++ emits vtables for classes that have key function present in the
current TU. While we compile CUDA the fact that key function was found
in this TU does not mean that we are going to generate code for it. E.g.
vtable for a class with host-only methods should not (and can not) be
generated on device side, because we'll never generate code for them
during device-side compilation.
This patch adds an extra CUDA-specific check during key method computation
and filters out potential key methods that are not suitable for this side
of CUDA compilation.
When we codegen vtable, entries for unsuitable methods are set to null.
Differential Revision: http://reviews.llvm.org/D15309
llvm-svn: 255911
Summary:
Second patch split out from http://reviews.llvm.org/D14752.
Maps metadata as a post-pass from each module when importing complete,
suturing up final metadata to the temporary metadata left on the
imported instructions.
This entails saving the mapping from bitcode value id to temporary
metadata in the importing pass, and from bitcode value id to final
metadata during the metadata linking postpass.
Depends on D14825.
Reviewers: dexonsmith, joker.eph
Subscribers: davidxl, llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D14838
llvm-svn: 255909
Summary:
The method insertNOPs expected the number of wait states to be passed as
parameter, while eliminateFrameIndex passed the immediate argument for the
S_NOP, leading to an off-by-one error. Rename the method to make the
meaning of its parameter clearer. The number of 4 / 5 wait states (which
is what the method has always _tried_ to do according to the comment) is
correct according to the hardware docs.
I stumbled upon this while trying to track down the cause of
https://bugs.freedesktop.org/show_bug.cgi?id=93264. While clearly needed,
this patch unfortunately does not fix that bug...
Reviewers: arsenm, tstellarAMD
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D15542
llvm-svn: 255906
Clang has better diagnostics in this case. It is not necessary therefore
to change the destructor to avoid what is effectively an invalid warning
in gcc. Instead, better handle the warning flags given to the compiler.
llvm-svn: 255905
Currently we can just inspect the details of the most common allocation types.
This patch allows us to support all the types defined by the RS runtime in its `RsDataType` enum.
Including handlers, matrices and packed graphical data.
llvm-svn: 255904
These days relocations are created and stored in a deterministic way.
The order they are created is also suitable for the .o file, so we don't
need an explicit sort.
The last remaining exception is MIPS.
llvm-svn: 255902
This patch enables PostRAScheduler specifically for AArch64 generic build,
which is beneficial from the performance perspective.
Speedups up to 2 to 7% for some benchmarks on A57 and A53 are observed.
Also benchmarks from LLVM test-suite did not regress.
Differential Revision: http://reviews.llvm.org/D15557
llvm-svn: 255896
This patch adds a DAG combine for (any_extend (extract_vector_elt v, i)) ->
(extract_vector_elt v, i). The combine enables us to better match some SMOV
patterns.
Differential Revision: http://reviews.llvm.org/D15515
llvm-svn: 255895
Some distributions of python have their version defined as follows in patchlevel.h (note the '+'):
#define PY_VERSION "2.7.9+"
The '+' char needs to be stripped by the cmake regex so that LLDBs python lib detection is successful.
Differential Revision: http://reviews.llvm.org/D15566
llvm-svn: 255893
When running 'clang -O3 -mllvm -polly -mllvm -polly-show' we now only show the
CFGs of functions with at least one detected scop. For larger files/projects
this reduces the number of graphs printed significantly and is likely what
developers want to see. The new option -polly-view-all enforces all graphs to be
printed and the exiting option -poll-view-only limites the graph printing to
functions that match a certain pattern.
This patch requires https://llvm.org/svn/llvm-project/llvm/trunk@255889 (and
vice versa) to compile correctly.
llvm-svn: 255891
Add MS inline asm support for structs that contain fields that are also structs.
Differential Revision: http://reviews.llvm.org/D15578
llvm-svn: 255890
The method processFunction() is called to decide if a graph should be shown for
a certain function. To allow DOTGraphTraitViewers to take this decision based
on the analysis results for the given function, we forward a reference to the
analysis result. This will be used by Polly to only visualize functions where
interesting loop regions have been detected.
llvm-svn: 255889
This patch adds support for printing global static const variables which are given a DW_AT_const_value DWARF tag by clang.
Fix for bug https://llvm.org/bugs/show_bug.cgi?id=25653
Reviewers: clayborg, tberghammer
Subscribers: emaste, lldb-commits
Differential Revision: http://reviews.llvm.org/D15576
llvm-svn: 255887
Summary:
clang-modernize transforms have moved to clang-tidy. Removing
the old tool now.
Reviewers: klimek
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D15606
llvm-svn: 255886
@indntpoff is similar to @gotntpoff, but for use in position dependent code. While @gotntpoff resolves to GOT slot address relative to the
start of the GOT in the movl or addl instructions, @indntpoff resolves to the
absolute GOT slot address. ("ELF Handling For Thread-Local Storage", Ulrich Drepper).
Differential revision: http://reviews.llvm.org/D15494
llvm-svn: 255884
Summary:
As we override the indent option of the LLVM style, we need to override the access modifier
offset as well. Otherwise, classes will be formatted like such
class A
{
public:
int foo;
};
which is not used anywhere in LLDB. This option makes clang-format style more similar to LLDB and
brings it closer to the original intention of LLVM style, which was to not indent access
modifiers.
Reviewers: zturner, tfiala
Subscribers: lldb-commits
Differential Revision: http://reviews.llvm.org/D15562
llvm-svn: 255882
Add option to enable/disable LEA optimization pass. By default the pass is disabled.
Differential Revision: http://reviews.llvm.org/D15573
llvm-svn: 255881
We've now seen the rerun test phase hang in a few
scenarios. Eliminate the serial test runner (which
is not exercised nearly as much as the others), by
using a multi-worker test runner strategy with a single
worker. This should rule out whether this is related
to the serial test runner strategy.
llvm-svn: 255880