not legal. However, it should use a div instruction + mul + sub if divide is
legal. The rem legalization code was missing a check and incorrectly uses a
divrem libcall even when div is legal.
rdar://12481395
llvm-svn: 165778
The intent of this document is to be the go-to document for anybody who
wants to write new documentation, but isn't familiar with Sphinx.
llvm-svn: 165775
* Fix confusing explanation regarding abstract classes.
* Clarify auto-upcasting and why `Shape` doesn't need a `classof()`.
* Add section `Rules of Thumb` with some quick summary tips.
llvm-svn: 165768
Recent changes to isa<>/dyn_cast<> have made unnecessary those classof()
of the form:
static bool classof(const Foo *) { return true; }
Accordingly, remove mention of such classof() from the documentation.
llvm-svn: 165766
Additionally, all such cases are handled with no dynamic check.
All `classof()` of the form
class Foo {
[...]
static bool classof(const Bar *) { return true; }
[...]
}
where Foo is an ancestor of Bar are no longer necessary.
Don't write them!
Note: The exact test is `is_base_of<Foo, Bar>`, which is non-strict, so
that Foo is considered an ancestor of itself.
This leads to the following rule of thumb for LLVM-style RTTI:
The argument type of `classof()` should be a strict ancestor.
For more information about implementing LLVM-style RTTI, see
docs/HowToSetUpLLVMStyleRTTI.rst
llvm-svn: 165765
This classof() is effectively saying that a MachineCodeEmitter "is-a"
JITEmitter, but JITEmitter is in fact a descendant of
MachineCodeEmitter, so this is not semantically correct. Consequently,
none of the assertions that rely on these classof() actualy check
anything.
Remove the RTTI (which didn't actually check anything) and use
static_cast<> instead.
Post-Mortem Bug Analysis
========================
Cause of the bug
----------------
r55022 appears to be the source of the classof() and assertions removed
by this commit. It aimed at removing some dynamic_cast<> that were
solely in the assertions. A typical diff hunk from that commit looked
like:
- assert(dynamic_cast<JITEmitter*>(MCE) && "Unexpected MCE?");
- JITEmitter *JE = static_cast<JITEmitter*>(getCodeEmitter());
+ assert(isa<JITEmitter>(MCE) && "Unexpected MCE?");
+ JITEmitter *JE = cast<JITEmitter>(getCodeEmitter());
Hence, the source of the bug then seems to be an attempt to replace
dynamic_cast<> with LLVM-style RTTI without properly setting up the
class hierarchy for LLVM-style RTTI. The bug therefore appears to be
simply a "thinko".
What initially indicated the presence of the bug
------------------------------------------------
After implementing automatic upcasting for isa<>, classof() functions of
the form
static bool classof(const Foo *) { return true; }
were removed, since they only serve the purpose of optimizing
statically-OK upcasts. A subsequent recompilation triggered a build
failure on the isa<> tests within the removed asserts, since the
automatic upcasting (correctly) failed to substitute this classof().
Key to pinning down the root cause of the bug
---------------------------------------------
After being alerted to the presence of the bug, some thought about the
semantics which were being asserted by the buggy classof() revealed that
it was incorrect.
How the bug could have been prevented
-------------------------------------
This bug could have been prevented by better documentation for how to
set up LLVM-style RTTI. This should be solved by the recently added
documentation HowToSetUpLLVMStyleRTTI. However, this bug suggests that
the documentation should clearly explain the contract that classof()
must fulfill. The HowToSetUpLLVMStyleRTTI already explains this
contract, but it is a little tucked away. A future patch will expand
that explanation and make it more prominent.
There does not appear to be a simple way to have the compiler prevent
this bug, since fundamentally it boiled down to a spurious classof()
where the programmer made an erroneous statement about the conversion.
This suggests that perhaps the interface to LLVM-style RTTI of classof()
is not the best. There is already some evidence for this, since in a
number of places Clang has classof() forward to classofKind(Kind K)
which evaluates the cast in terms of just the Kind. This could probably
be generalized to simply a `static const Kind MyKind;` field in leaf
classes and `static const Kind firstMyKind, lastMyKind;` for non-leaf
classes, and have the rest of the work be done inside Casting.h,
assuming that the Kind enum is laid out in a preorder traversal of the
inheritance tree.
llvm-svn: 165764
When all cases of a switch statement are dead, the weights vector only has one
element, and we will get an ssertion failure when calling createBranchWeights.
llvm-svn: 165759
to the instruction position. The old encoding would give an absolute
ID which counts up within a function, and only resets at the next function.
I.e., Instead of having:
... = icmp eq i32 n-1, n-2
br i1 ..., label %bb1, label %bb2
it will now be roughly:
... = icmp eq i32 1, 2
br i1 1, label %bb1, label %bb2
This makes it so that ids remain relatively small and can be encoded
in fewer bits.
With this encoding, forward reference operands will be given
negative-valued IDs. Use signed VBRs for the most common case
of forward references, which is phi instructions.
To retain backward compatibility we bump the bitcode version
from 0 to 1 to distinguish between the different encodings.
llvm-svn: 165739
Not all instructions define a virtual register in their first operand.
Specifically, INLINEASM has a different format.
<rdar://problem/12472811>
llvm-svn: 165721
For function calls on the 64-bit PowerPC SVR4 target, each parameter
is mapped to as many doublewords in the parameter save area as
necessary to hold the parameter. The first 13 non-varargs
floating-point values are passed in registers; any additional
floating-point parameters are passed in the parameter save area. A
single-precision floating-point parameter (32 bits) must be mapped to
the second (rightmost, low-order) word of its assigned doubleword
slot.
Currently LLVM violates this ABI requirement by mapping such a
parameter to the first (leftmost, high-order) word of its assigned
doubleword slot. This is internally self-consistent but will not
interoperate correctly with libraries compiled with an ABI-compliant
compiler.
This patch corrects the problem by adjusting the parameter addressing
on both sides of the calling convention.
llvm-svn: 165714
Note: [D]M{T,F}CP2 is just a recommended encoding. Vendors often provide a
custom CP2 that interprets instructions differently and may wish to add their
own instructions that use this opcode. We should ensure that this is easy to
do. I will probably add a 'has custom CP{0-3}' subtarget flag to make this
easy: We want to avoid the GCC situation where every MIPS vendor makes a custom
fork that breaks every other MIPS CPU and so can't be merged upstream.
llvm-svn: 165711
Patch from Preston Briggs <preston.briggs@gmail.com>.
This is an updated version of the dependence-analysis patch, including an MIV
test based on Banerjee's inequalities.
It's a fairly complete implementation of the paper
Practical Dependence Testing
Gina Goff, Ken Kennedy, and Chau-Wen Tseng
PLDI 1991
It cannot yet propagate constraints between coupled RDIV subscripts (discussed
in Section 5.3.2 of the paper).
It's organized as a FunctionPass with a single entry point that supports testing
for dependence between two instructions in a function. If there's no dependence,
it returns null. If there's a dependence, it returns a pointer to a Dependence
which can be queried about details (what kind of dependence, is it loop
independent, direction and distance vector entries, etc). I haven't included
every imaginable feature, but there's a good selection that should be adequate
for supporting many loop transformations. Of course, it can be extended as
necessary.
Included in the patch file are many test cases, commented with C code showing
the loops and array references.
llvm-svn: 165708
value but later turns out to be a function.
Unfortunately, we can't fold tests into a single file because we only get one
error out of llvm-as.
llvm-svn: 165680
Original message:
The attached is the fix to radar://11663049. The optimization can be outlined by following rules:
(select (x != c), e, c) -> select (x != c), e, x),
(select (x == c), c, e) -> select (x == c), x, e)
where the <c> is an integer constant.
The reason for this change is that : on x86, conditional-move-from-constant needs two instructions;
however, conditional-move-from-register need only one instruction.
While the LowerSELECT() sounds to be the most convenient place for this optimization, it turns out to be a bad place. The reason is that by replacing the constant <c> with a symbolic value, it obscure some instruction-combining opportunities which would otherwise be very easy to spot. For that reason, I have to postpone the change to last instruction-combining phase.
The change passes the test of "make check-all -C <build-root/test" and "make -C project/test-suite/SingleSource".
llvm-svn: 165661
the compiler makes use of GPR0. However, there are two flavors of
GPR0 defined by the target: the 32-bit GPR0 (R0) and the 64-bit GPR0
(X0). The spill/reload code makes use of R0 regardless of whether we
are generating 32- or 64-bit code.
This patch corrects the problem in the obvious manner, using X0 and
ADDI8 for 64-bit and R0 and ADDI for 32-bit.
llvm-svn: 165658
the Altivec extensions were introduced. Its use is optional, and
allows the compiler to communicate to the operating system which
vector registers should be saved and restored during a context switch.
In practice, this information is ignored by the various operating
systems using the SVR4 ABI; the kernel saves and restores the entire
register state. Setting the VRSAVE register is no longer performed by
the AIX XL compilers, the IBM i compilers, or by GCC on Power Linux
systems. It seems best to avoid this logic within LLVM as well.
This patch avoids generating code to update and restore VRSAVE for the
PowerPC SVR4 ABIs (32- and 64-bit). The code remains in place for the
Darwin ABI.
llvm-svn: 165656
The minimum set of required instructions is ISD::AND, ISD::OR, ISD::SETO(or ISD::SETOEQ) and ISD::SETUO(or ISD::SETUNE). Everything is expanded into one of two patterns:
Pattern 1: (LHS CC1 RHS) Opc (LHS CC2 RHS)
Pattern 2: (LHS CC1 LHS) Opc (RHS CC2 RHS)
llvm-svn: 165655
Some of these dyn_cast<>'s would be better phrased as isa<> or cast<>.
That will happen in a future patch.
There are also two dyn_cast_or_null<>'s slipped in instead of
dyn_cast<>'s, since they were causing crashes with just dyn_cast<>.
llvm-svn: 165646
Hypothesis 1: use of `.. code::` directive instead of `.. code-block::`
is causing Sphinx to discard the block. On my machine, `.. code::`
renders fine. However, I don't think that `.. code::` is actually a
legit Sphinx directive. I believe that on my machine Sphinx is falling
back to just displaying it monospace with no syntax, whereas llvm.org's
Sphinx is just discarding it.
This is truly "remote debugging" since I can't reproduce this on my
machine. It would be helpful to be able to see the llvm.org Sphinx
build logs; if that's possible please let me know.
llvm-svn: 165632
- Due to the current matching vector elements constraints in
ISD::FP_ROUND, rounding from v2f64 to v4f32 (after legalization from
v2f32) is scalarized. Add a customized v2f32 widening to convert it
into a target-specific X86ISD::VFPROUND to work around this
constraints.
llvm-svn: 165631
- Due to the current matching vector elements constraints in ISD::FP_EXTEND,
rounding from v2f32 to v2f64 is scalarized. Add a customized v2f32 widening
to convert it into a target-specific X86ISD::VFPEXT to work around this
constraints. This patch also reverts a previous attempt to fix this issue by
recovering the scalarized ISD::FP_EXTEND pattern and thus significantly
reduces the overhead of supporting non-power-2 vector FP extend.
llvm-svn: 165625
SDNode for LDRB_POST_IMM is invalid: number of registers added to SDNode fewer
that described in .td.
7 ops is needed, but SDNode with only 6 is created.
In more details:
In ARMInstrInfo.td, in multiclass AI2_ldridx, in definition _POST_IMM, offset
operand is defined as am2offset_imm. am2offset_imm is complex parameter type,
and actually it consists from dummy register and imm itself. As I understood
trick with dummy reg was made for AsmParser. In ARMISelLowering.cpp, this dummy
register was not added to SDNode, and it cause crash in Peephole Optimizer pass.
The problem fixed by setting up additional dummy reg when emitting
LDRB_POST_IMM instruction.
llvm-svn: 165617
SchedulerDAGInstrs::buildSchedGraph ignores dependencies between FixedStack
objects and byval parameters. So loading byval parameters from stack may be
inserted *before* it will be stored, since these operations are treated as
independent.
Fix:
Currently ARMTargetLowering::LowerFormalArguments saves byval registers with
FixedStack MachinePointerInfo. To fix the problem we need to store byval
registers with MachinePointerInfo referenced to first the "byval" parameter.
Also commit adds two new fields to the InputArg structure: Function's argument
index and InputArg's part offset in bytes relative to the start position of
Function's argument. E.g.: If function's argument is 128 bit width and it was
splitted onto 32 bit regs, then we got 4 InputArg structs with same arg index,
but different offset values.
llvm-svn: 165616
Allows the new machine model to be used for NumMicroOps and OutputLatency.
Allows the HazardRecognizer to be disabled along with itineraries.
llvm-svn: 165603
This wasn't contributing anything significant to postRA heuristics except compile time (by my measurements) and will be replaced by a more general heuristic for cross-region dependencies within the scheduler itself.
llvm-svn: 165563
This patch provides initial implementation of load address
macro instruction for Mips. We have implemented two kinds
of expansions with their variations depending on the size
of immediate operand:
1) load address with immediate value directly:
* la d,j => addiu d,$zero,j (for -32768 <= j <= 65535)
* la d,j => lui d,hi16(j)
ori d,d,lo16(j) (for any other 32 bit value of j)
2) load load address with register offset value
* la d,j(s) => addiu d,s,j (for -32768 <= j <= 65535)
* la d,j(s) => lui d,hi16(j) (for any other 32 bit value of j)
ori d,d,lo16(j)
addu d,d,s
This patch does not cover the case when the address is loaded
from the value of the label or function.
Contributer: Vladimir Medic
llvm-svn: 165561
DeadArgumentElimination pass can replace one LLVM function with another,
invalidating a pointer stored in debug info metadata entry for this function.
To fix this, we collect debug info descriptors for functions before
running a DeadArgumentElimination pass and "patch" pointers in metadata nodes
if we replace a function.
llvm-svn: 165490
We use the enums to query whether an Attributes object has that attribute. The
opaque layer is responsible for knowing where that specific attribute is stored.
llvm-svn: 165488
This class is used by LSR and a number of places in the codegen.
This is the first step in de-coupling LSR from TLI, and creating
a new interface in between them.
llvm-svn: 165455
When the CFG contains a loop with multiple entry blocks, the traces
computed by MachineTraceMetrics don't always have the same nice
properties. Loop back-edges are normally excluded from traces, but
MachineLoopInfo doesn't recognize loops with multiple entry blocks, so
those back-edges may be included.
Avoid asserting when that happens by adding an isEarlierInSameTrace()
function that accurately determines if a dominating block is part of the
same trace AND is above the currrent block in the trace.
llvm-svn: 165434
Start using the AttributesImpl object to hold the value of the attributes. All
queries go through the interfaces now.
This has one unfortunate consequence. I needed to move the AttributesImpl.h file
into include/llvm. But this is only temporary! Otherwise, the changes needed to
support this would be too large.
llvm-svn: 165433
Vector compare using altivec 'vcmpxxx' instructions have as third argument
a vector register instead of CR one, different from integer and float-point
compares. This leads to a failure in code generation, where 'SelectSETCC'
expects a DAG with a CR register and gets vector register instead.
This patch changes the behavior by just returning a DAG with the
vector compare instruction based on the type. The patch also adds a testcase
for all vector types llvm defines.
It also included a fix on signed 5-bits predicates printing, where
signed values were not handled correctly as signed (char are unsigned by
default for PowerPC). This generates 'vspltisw' (vector splat)
instruction with SIM out of range.
llvm-svn: 165419
Without this change, when the estimated cost for inlining a function with
an "alwaysinline" attribute was lower than the inlining threshold, the
getInlineCost function was returning that estimated cost rather than the
special InlineCost::AlwaysInlineCost value. That is fine in the normal
inlining case, but it can fail when the inliner considers the opportunity
cost of inlining into an internal or linkonce-odr function. It may decide
not to inline the always-inline function in that case. The fix here is just
to make getInlineCost always return the special value for always-inline
functions. I ran into this building clang with libc++. Tablegen failed to
link because of an always-inline function that was not inlined. I have been
unable to reduce the testcase down to a reasonable size.
llvm-svn: 165367
into separate versions for the Darwin and 64-bit SVR4 ABIs. This will
facilitate doing more major surgery on the 64-bit SVR4 ABI in the near future.
llvm-svn: 165336
- Substitute hyphen to underscore, s/-/_/g, as the variable name.
- Additional parameter can be specified as the name of directory.
e.g.) add_llvm_external_project(clang-tools-extra extra)
- LLVM_EXTERNAL_CLANG_TOOLS_EXTRA_SOURCE_DIR=/path/to/llvm-srcroot/tools/clang/tools/extra, by default.
- Build directory is in ${CMAKE_CURRENT_BINARY_DIR}/extra
llvm-svn: 165311
have an alloca or a parameter, since then the alloca test should make sense
to readers, while before it probably appears too specific. No functionality
change.
llvm-svn: 165306
The internal representation of the Attributes class will be opaque. All of the
query methods will need to query the opaque class. Therefore, these methods need
to be out-of-line.
No functionality change intended.
llvm-svn: 165305
This document describes how to set up LLVM-style RTTI for a class
hierarchy. Surprisingly, this was not previously documented.
Also, link it into ProgrammersManual.html.
llvm-svn: 165293
This is a mechanical change of dynamic_cast<> to dyn_cast<>. A number of
these uses are actually more like isa<> or cast<>, and will be changed
to the semanticaly appropriate one in a future patch.
llvm-svn: 165291
are in fact identity operations. We detect these and kill their
partitions so that even splitting is unaffected by them. This is
particularly important because Clang relies on emitting identity memcpy
operations for struct copies, and these fold away to constants very
often after inlining.
Fixes the last big performance FIXME I have on my plate.
llvm-svn: 165285
the rewrite visitor to make the fact that the speculation is completely
independent a bit more clear.
I promise that this is just a cut/paste of the one visitor and adding
the annonymous namespace wrappings. The diff may look completely
preposterous, it does in git for some reason.
llvm-svn: 165284
Make sure functions located in user specified text sections (via the
section attribute) are located together with the default text sections.
Otherwise, for large object files, the relocations for call instructions
are more likely to be out of range. This becomes even more likely in the
presence of LTO.
rdar://12402636
llvm-svn: 165254
a) frame setup instructions define the prologue
b) we shouldn't change our location mid-stream
Add a test to make sure that the stack adjustment stays within
the prologue.
llvm-svn: 165250
- Add 'HwEncoding' for X86 registers and call getEncodingValue() to
retrieve their encoding values.
- This's the first step to adopt new scheme. Furthur revising is onging.
llvm-svn: 165241
"Instruction 'foo' has no tokens" errors during llvm-tblgen
-gen-asm-matcher attempts. At this time, the added
tokens are "#comment" style rather than the actual mnemonic. This will
be revisited once the rest of the base asmparser bits get straightened
out for ppc64-elf-linux.
llvm-svn: 165237
We conservatively only check the first use to avoid walking long use chains.
This catches the common case of having both a load and a store to a pointer
supplied by a PHI node.
llvm-svn: 165232
cpyDest can be mutated in some cases, which would then cause a crash later if
indeed the memory was underaligned. This brought down several buildbots, so
I guess the underaligned case is much more common than I thought!
llvm-svn: 165228
Currently, we re-visit allocas when something changes about the way they
might be *split* to allow better scalarization to take place. However,
we weren't handling the case when the *promotion* is what would change
the behavior of SROA. When an address derived from an alloca is stored
into another alloca, we consider the first to have escaped. If the
second is ever promoted to an SSA value, we will suddenly be able to run
the SROA pass on the first alloca.
This patch adds explicit support for this form if iteration. When we
detect a store of a pointer derived from an alloca, we flag the
underlying alloca for reprocessing after promotion. The logic works hard
to only do this when there is definitely going to be promotion and it
might remove impediments to the analysis of the alloca.
Thanks to Nick for the great test case and Benjamin for some sanity
check review.
llvm-svn: 165223
was less aligned than the old. In the testcase this results in an overaligned
memset: the memset alignment was correct for the original memory but is too much
for the new memory. Fix this by either increasing the alignment of the new
memory or bailing out if that isn't possible. Should fix the gcc-4.7 self-host
buildbot failure.
llvm-svn: 165220
Sorry for this being broken so long. =/
As part of this, switch all of the existing tests to be Little Endian,
which is the behavior I was asserting in them anyways! Add in a new
big-endian test that checks the interesting behavior there.
Another part of this is to tighten the rules abotu when we perform the
full-integer promotion. This logic now rejects cases where there fully
promoted integer is a non-multiple-of-8 bitwidth or cases where the
loads or stores touch bits which are in the allocated space of the
alloca but are not loaded or stored when accessing the integer. Sadly,
these aren't really observable today as the rest of the pass will
already ensure the invariants hold. However, the latter situation is
likely to become a potential concern in the future.
Thanks to Benjamin and Duncan for early review of this patch. I'm still
looking into whether there are further endianness issues, please let me
know if anyone sees BE failures persisting past this.
llvm-svn: 165219