Summary:
This patch optimizes a binop sandwiched between 2 selects with the same condition. Since we know its only used by the select we can propagate the appropriate input value from the earlier select.
As I'm writing this I realize I may need to avoid doing this for division in case the select was protecting a divide by zero?
Reviewers: spatel, majnemer
Reviewed By: majnemer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D39999
llvm-svn: 318267
break. The alignas(__uint128_t) is not recognized with MSVC
it looks like. Zachary, is there a similar type on windows?
I suppose I can go with alignas(16) here but I'd prefer to
specify the type alignment that I want & let the ABI dictate
how much padding is required.
llvm-svn: 318262
Summary:
This change fixes the XRay trampolines aside from the __xray_CustomEvent
trampoline to align the stack to 16-byte boundaries before calling the
handler. Before this change we've not been explicitly aligning the stack
to 16-byte boundaries, which makes it dangerous when calling handlers
that leave the stack in a state that isn't strictly 16-byte aligned
after calling the handlers.
We add a test that makes sure we can handle these cases appropriately
after the changes, and prevents us from regressing the state moving
forward.
Fixes http://llvm.org/PR35294.
Reviewers: pelikan, pcc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D40004
llvm-svn: 318261
When we merge together class definitions, we can end up with the canonical
declaration of a field not being the one that was lexically within the
canonical definition of the class. Additionally, when we merge class
definitions via update records (eg, for a template specialization whose
declaration is instantiated in one module and whose definition is instantiated
in multiple others), we can end up with the list of lexical contents for the
class not including a particular declaration of a field whose lexical parent is
that class definition. In the worst case, we have a field whose canonical
declaration's lexical parent has no fields, and in that case this attempt to
number the fields by walking the fields in the declaration of the class that
contained one of the canonical fields will fail.
Instead, when numbering fields in a class, do the obvious thing: walk the
fields in the definition.
I'm still trying to reduce a testcase; the setup that leads to the above
scenario seems to be quite fragile.
llvm-svn: 318245
It crashes building sqlite; see reply on the llvm-commits thread.
> [SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops.
>
> Patch tries to improve vectorization of the following code:
>
> void add1(int * __restrict dst, const int * __restrict src) {
> *dst++ = *src++;
> *dst++ = *src++ + 1;
> *dst++ = *src++ + 2;
> *dst++ = *src++ + 3;
> }
> Allows to vectorize even if the very first operation is not a binary add, but just a load.
>
> Fixed issues related to previous commit.
>
> Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev
>
> Reviewed By: ABataev, RKSimon
>
> Subscribers: llvm-commits, RKSimon
>
> Differential Revision: https://reviews.llvm.org/D28907
llvm-svn: 318239
Summary:
This patch adds another failure mode for `validateCFIProtection(..)`, wherein any register that affects the indirect control flow instruction is clobbered to between the CFI-check and the instruction's execution.
Also includes a modification to make MCInstrDesc::hasDefOfPhysReg public.
Reviewers: vlad.tsyrklevich
Reviewed By: vlad.tsyrklevich
Subscribers: llvm-commits, pcc, kcc
Differential Revision: https://reviews.llvm.org/D39820
llvm-svn: 318238
Simplifying a loop latch changes the IR and we need to make sure the pass manager knows to invalidate analysis passes if that happened.
PR35210 discovered a case where we failed to invalidate the post dominator tree after this simplification because we no changes other than simplifying the loop latch.
Fixes PR35210.
Differential Revision: https://reviews.llvm.org/D40035
llvm-svn: 318237
Summary:
In the mode when ASan shadow base is computed as the address of an
external global (__asan_shadow, currently on android/arm32 only),
regalloc prefers to rematerialize this value to save register spills.
Even in -Os. On arm32 it is rather expensive (2 loads + 1 constant
pool entry).
This changes adds an inline asm in the function prologue to suppress
this behavior. It reduces AsanTest binary size by 7%.
Reviewers: pcc, vitalybuka
Subscribers: aemerson, kristof.beyls, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D40048
llvm-svn: 318235
LLVM exposes a file in the backend (X86TargetParser.def) that
contains information about the correct list of CpuIs values.
This patch removes 2 of the copied and pasted versions of this
list from clang and instead includes the data from the .def file.
Differential Revision: https://reviews.llvm.org/D40054
llvm-svn: 318234
Lifting from Bob Wilson's notes: The hash value that we compute and
store in PGO profile data to detect out-of-date profiles does not
include enough information. This means that many significant changes to
the source will not cause compiler warnings about the profile being out
of date, and worse, we may continue to use the outdated profile data to
make bad optimization decisions. There is some tension here because
some source changes won't affect PGO and we don't want to invalidate the
profile unnecessarily.
This patch adds a new hashing scheme which is more sensitive to loop
nesting, conditions, and out-of-order control flow. Here are examples
which show snippets which get the same hash under the current scheme,
and different hashes under the new scheme:
Loop Nesting Example
--------------------
// Snippet 1
while (foo()) {
while (bar()) {}
}
// Snippet 2
while (foo()) {}
while (bar()) {}
Condition Example
-----------------
// Snippet 1
if (foo())
bar();
baz();
// Snippet 2
if (foo())
bar();
else
baz();
Out-of-order Control Flow Example
---------------------------------
// Snippet 1
while (foo()) {
if (bar()) {}
baz();
}
// Snippet 2
while (foo()) {
if (bar())
continue;
baz();
}
In each of these cases, it's useful to differentiate between the
snippets because swapping their profiles gives bad optimization hints.
The new hashing scheme considers some logical operators in an effort to
detect more changes in conditions. This isn't a perfect scheme. E.g, it
does not produce the same hash for these equivalent snippets:
// Snippet 1
bool c = !a || b;
if (d && e) {}
// Snippet 2
bool f = d && e;
bool c = !a || b;
if (f) {}
This would require an expensive data flow analysis. Short of that, the
new hashing scheme looks reasonably complete, based on a scan over the
statements we place counters on.
Profiles which use the old version of the PGO hash remain valid and can
be used without issue (there are tests in tree which check this).
rdar://17068282
Differential Revision: https://reviews.llvm.org/D39446
llvm-svn: 318229
This is no longer needed for any of the runtimes build and it breaks
in case we don't have the working compiler yet, e.g. when building
a compiler that uses compiler-rt and libc++ as a default runtime,
because these common options check whether these are available.
Differential Revision: https://reviews.llvm.org/D39932
llvm-svn: 318227
Even when building builtins and runtimes for the default target
we shouldn't assume that the just built compiler is already useable.
When the compiler uses compiler-rt and libc++ as the default runtime
and C++ library, it won't be usable until we finish building runtimes.
Differential Revision: https://reviews.llvm.org/D39715
llvm-svn: 318224
The test was doing -stop-after=isel, but that pass is actually the
AMDGPUDAGToDAGISel pass, which might not be built when targeting x86_64.
This changes the test to -stop-after=expand-isel-pseudos instead.
Follow-up to r318202.
llvm-svn: 318220
On e.g. PPC the return value and argument were marked 'signext'. This
makes the test expectations a bit more flexible.
Follow-up to r318199.
llvm-svn: 318214
Allows users to view GraphResult objects in a DOT directed-graph format. This feature can be turned on through the --print-graphs flag.
Also enabled pretty-printing of instructions in output. Together these features make analysis of unprotected CF instructions much easier by providing a visual control flow graph.
Reviewers: pcc
Subscribers: llvm-commits, kcc, vlad.tsyrklevich
Differential Revision: https://reviews.llvm.org/D39819
llvm-svn: 318211
artifacts along with DCE
Legalization Artifacts are all those insts that are there to make the
type system happy. Currently, the target needs to say all combinations
of extends and truncs are legal and there's no way of verifying that
post legalization, we only have *truly* legal instructions. This patch
changes roughly the legalization algorithm to process all illegal insts
at one go, and then process all truncs/extends that were added to
satisfy the type constraints separately trying to combine trivial cases
until they converge. This has the added benefit that, the target
legalizerinfo can only say which truncs and extends are okay and the
artifact combiner would combine away other exts and truncs.
Updated legalization algorithm to roughly the following pseudo code.
WorkList Insts, Artifacts;
collect_all_insts_and_artifacts(Insts, Artifacts);
do {
for (Inst in Insts)
legalizeInstrStep(Inst, Insts, Artifacts);
for (Artifact in Artifacts)
tryCombineArtifact(Artifact, Insts, Artifacts);
} while(!Insts.empty());
Also, wrote a simple wrapper equivalent to SetVector, except for
erasing, it avoids moving all elements over by one and instead just
nulls them out.
llvm-svn: 318210
In addition to the current ON and OFF options, this adds the FORCE_ON
option, which causes a configuration error if libxml2 cannot be used.
Differential revision: https://reviews.llvm.org/D40050
llvm-svn: 318209
This adjusts the tests to hopfully pacify the
llvm-clang-x86_64-expensive-checks-win buildbot.
Unlike many other instructions, these instructions have aliases which
take coprocessor registers, gpr register, accumulator (and dsp accumulator)
registers, floating point registers, floating point control registers and
coprocessor 2 data and control operands.
For the moment, these aliases are treated as pseudo instructions which are
expanded into the underlying instruction. As a result, disassembling these
instructions shows the underlying instruction and not the alias.
Reviewers: slthakur, atanasyan
Differential Revision: https://reviews.llvm.org/D35253
llvm-svn: 318207