They are not in sync now, for example Bitcast would show up as LLVMCall.
So instead introduce 2 functions that map to and from the opcodes in the C
bindings.
llvm-svn: 141290
merging an lsl #2 that has multiple uses on A9. This shift is free, so there is
no problem merging it in multiple places. Other unprofitable shifts will not be
merged.
llvm-svn: 141247
For consistency, prefix multiclass template arg names with the
multiclass name followed by "::" to avoid name clashes among
multiclass arguments and other entities in the multiclass.
llvm-svn: 141239
Process each multidef declared in a multiclass. Iterate through the
list and instantiate a def in the multiclass for each item, resolving
the list item to the temporary iterator (possibly) used in the
multidef ObjectBody. We then process each generated def in the normal
way.
llvm-svn: 141233
Add parser support to recognize multidefs. No processing on the
multidef is done at this point. The grammar is:
MultiDef = MULTIDEF ObjectName < Value, Declaration, Value > ObjectBody
The first Value must be resolveable to a list and the second Value
must be resolveable to an integer. The Declaration is a temporary
value used as an iterator to refer to list items during processing.
It may be passed into the ObjectBody where it will be substituted with
the list value used to instantiate each def.
llvm-svn: 141232
Move the code to instantiate a multiclass def, bind its arguments and
resolve its members into three helper functions. These will be reused
to support a new kind of multiclass def: a multidef.
llvm-svn: 141229
While I'm here, fix the related issue with strncmp, add some actual tests for strcmp and strncmp, and start using StringRef::compare for constant folding instead of using strcmp/strncmp so that the optimized IR isn't dependent on the host's implementation of strcmp.
llvm-svn: 141227
PhysReg operands are not allowed to have sub-register indices at all.
For virtual registers with sub-reg indices, check that all registers in
the register class support the sub-reg index.
llvm-svn: 141220
Just pull the instruction name, but don't change the order of anything
else. That keeps --debug happy and non-crashing, but doesn't change
how the worklist gets built.
llvm-svn: 141210
EXTRACT_SUBREG is emitted as %dst = COPY %src:sub, so there is no need to
constrain the %dst register class. RegisterCoalescer will apply the
necessary constraints if it decides to eliminate the COPY.
The %src register class does need to be constrained to something with
the right sub-registers, though. This is currently done manually with
COPY_TO_REGCLASS nodes. They can possibly be removed after this patch.
llvm-svn: 141207
There are fewer registers with sub_8bit sub-registers in 32-bit mode
than in 64-bit mode. In 32-bit mode, sub_8bit behaves the same as
sub_8bit_hi.
llvm-svn: 141206
When updating the worklist for InstCombine, the Add/AddUsersToWorklist
functions may access the instruction(s) being added, for debug output for
example. If the instructions aren't yet added to the basic block, this
can result in a crash. Finish the instruction transformation before
adjusting the worklist instead.
rdar://10238555
llvm-svn: 141203
The register class created by INSERT_SUBREG and SUBREG_TO_REG must be
legal and support the SubIdx sub-registers.
The new getSubClassWithSubReg() hook can compute that.
This may create INSERT_SUBREG instructions defining a larger register
class than the sub-register being inserted. That is OK,
RegisterCoalescer will constrain the register class as needed when it
eliminates the INSERT_SUBREG instructions.
llvm-svn: 141198
TwoAddressInstructionPass should annotate instructions with <undef>
flags when it lower REG_SEQUENCE instructions. LiveIntervals should not
be in the business of modifying code (except for kill flags, perhaps).
llvm-svn: 141187
branch "br i1 %x, label %if_true, label %if_false" then it replaces
"%x" with "true" in places only reachable via the %if_true arm, and
with "false" in places only reachable via the %if_false arm. Except
that actually it doesn't: if value numbering shows that %y is equal
to %x then, yes, %y will be turned into true/false in this way, but
any occurrences of %x itself are not transformed. Fix this. What's
more, it's often the case that %x is an equality comparison such as
"%x = icmp eq %A, 0", in which case every occurrence of %A that is
only reachable via the %if_true arm can be replaced with 0. Implement
this and a few other variations on this theme. This reduces the number
of lines of LLVM IR in "GCC as one big file" by 0.2%. It has a bigger
impact on Ada code, typically reducing the number of lines of bitcode
by around 0.4% by removing repeated compiler generated checks. Passes
the LLVM nightly testsuite and the Ada ACATS testsuite.
llvm-svn: 141177
it's OK for the false/true destination to have multiple
predecessors as long as the extra ones are dominated by
the branch destination.
llvm-svn: 141176
I noticed during self-review that my previous checkin disabled some
analysis. Even with the reenabled analysis the test case runs in about
5ms. Without the fix, it will take several minutes at least.
llvm-svn: 141164
Note to compiler writers: never recurse on multiple instruction
operands without memoization.
Fixes rdar://10187945. Was taking 45s, now taking 5ms.
llvm-svn: 141161
This is a first pass at generating the jump table for the sjlj dispatch. It
currently generates something plausible, but hasn't been tested thoroughly.
llvm-svn: 141140
For example:
%vreg10:dsub_0<def,undef> = COPY %vreg1
%vreg10:dsub_1<def> = COPY %vreg2
is rewritten as:
%D2<def> = COPY %D0, %Q1<imp-def>
%D3<def> = COPY %D1, %Q1<imp-use,kill>, %Q1<imp-def>
The first COPY doesn't care about the previous value of %Q1, so it
doesn't read that register.
The second COPY is a partial redefinition of %Q1, so it implicitly kills
and redefines that register.
This makes it possible to recognize instructions that can harmlessly
clobber the full super-register. The write and don't read the
super-register.
llvm-svn: 141139
RegisterCoalescer can create sub-register defs when it is joining a
register with a sub-register. Add <undef> flags to these new
sub-register defs where appropriate.
llvm-svn: 141138
using llvm's public 'C' disassembler API now including annotations.
Hooked this up to Darwin's otool(1) so it can again print things like branch
targets for example this:
blx _puts
instead of this:
blx #-36
and includes support for annotations for branches to symbol stubs like:
bl 0x40 @ symbol stub for: _puts
and annotations for pc relative loads like this:
ldr r3, #8 @ literal pool for: Hello, world!
Also again can print the expression encoded in the Mach-O relocation entries for
things like this:
movt r0, :upper16:((_foo-_bar)+1234)
llvm-svn: 141129
The <undef> flag says that a MachineOperand doesn't read its register,
or doesn't depend on the previous value of its register.
A full register def never depends on the previous register value. A
partial register def may depend on the previous value if it is intended
to update part of a register.
For example:
%vreg10:dsub_0<def,undef> = COPY %vreg1
%vreg10:dsub_1<def> = COPY %vreg2
The first copy instruction defines the full %vreg10 register with the
bits not covered by dsub_0 defined as <undef>. It is not considered a
read of %vreg10.
The second copy modifies part of %vreg10 while preserving the rest. It
has an implicit read of %vreg10.
This patch adds a MachineOperand::readsReg() method to determine if an
operand reads its register.
Previously, this was modelled by adding a full-register <imp-def>
operand to the instruction. This approach makes it possible to
determine directly from a MachineOperand if it reads its register. No
scanning of MI operands is required.
llvm-svn: 141124
When resolving an operator list element reference, resolve all
operator operands and try to fold the operator first. This allows the
operator to collapse to a list which may then be indexed.
Before, it was not possible to do this:
class D<int a, int b> { ... }
class C<list<int> A> : D<A[0], A[1]>;
class B<list<int> b> : C<!foreach(...,b)>;
Now it is.
llvm-svn: 141101
This patch adds a preprocessor that can expand nested for-loops for
saving some copy-n-paste in *.td files.
The preprocessor is not yet integrated with TGParser, and so it has
no direct effect on *.td inputs. However, you may preprocess an td
input (and only preprocess it).
To test the proprecessor, type:
tblgen -E -o $@ $<
llvm-svn: 141079
This handles the case in which LSR rewrites an IV user that is a phi and
splits critical edges originating from a switch.
Fixes <rdar://problem/6453893> LSR is not splitting edges "nicely"
llvm-svn: 141059
This code will replace the version in ARMAsmPrinter.cpp. It creates a new
machine basic block, which is the dispatch for the return from a longjmp
call. It then shoves the address of that machine basic block into the correct
place in the function context so that the EH runtime will jump to it directly
instead of having to go through a compare-and-jump-to-the-dispatch bit. This
should be more efficient in the common case.
llvm-svn: 141031
It's documented as a separate instruction to line up with the Thumb1
encodings, for which it really is a distinct instruction encoding.
llvm-svn: 141020
* Add a couple of Create methods to the ARMConstantPoolConstant class,
* Add its own version of getExistingMachineCPValue, and
* Modify hasSameValue to return false if the object isn't an ARMConstantPoolConstant.
llvm-svn: 140935
We want heuristics to be based on accurate data, but more importantly
we don't want llvm to behave randomly. A benign trunc inserted by an
upstream pass should not cause a wild swings in optimization
level. See PR11034. It's a general problem with threshold-based
heuristics, but we can make it less bad.
llvm-svn: 140919
and the alignment is 0 (i.e., it's defined globally in one file and declared in
another file) it could get an alignment which is larger than the ABI allows for
that type, resulting in aligned moves being used for unaligned loads.
For instance, in file A.c:
struct S s;
In file B.c:
struct {
// something long
};
extern S s;
void foo() {
struct S p = s;
// ...
}
this copy is a 'memcpy' which is turned into a series of 'movaps' instructions
on X86. But this is wrong, because 'struct S' has alignment of 4, not 16.
llvm-svn: 140902
This uses less memory and it reduces the complexity of sub-class
operations:
- hasSubClassEq() and friends become O(1) instead of O(N).
- getCommonSubClass() becomes O(N) instead of O(N^2).
In the future, TableGen will infer register classes. This makes it
cheap to add them.
llvm-svn: 140898
Remove an assert that was expecting only the relevant 16bit portion for
the fixup being handled. Also kill some dead code in the T2 portion.
rdar://9653509
llvm-svn: 140861
catch or repeated filter clauses. Teach instcombine a bunch
of tricks for simplifying landingpad clauses. Currently the
code only recognizes the GNU C++ and Ada personality functions,
but that doesn't stop it doing a bunch of "generic" transforms
which are hopefully fine for any real-world personality function.
If these "generic" transforms turn out not to be generic, they
can always be conditioned on the personality function. Probably
someone should add the ObjC++ personality function. I didn't as
I don't know anything about it.
llvm-svn: 140852
This helps with porting code from 2.9 to 3.0 as TargetSelect.h changed location,
and if you include the old one by accident you will trigger this assert.
llvm-svn: 140848
Encode the immediate into its 8-bit form as part of isel rather than later,
which simplifies things for mapping the encoding bits, allows the removal
of the custom disassembler decoding hook, makes the operand printer trivial,
and prepares things more cleanly for handling these in the asm parser.
rdar://10211428
llvm-svn: 140834
Rewriting the entire loop nest now requires -enable-lsr-nested.
See PR11035 for some performance data.
A few unit tests specifically test nested LSR, and are now under a flag.
llvm-svn: 140762
The function needs to scan the implicit operands anyway, so no
performance is won by caching the number of implicit operands added to
an instruction.
This also fixes a bug when adding operands after an implicit operand has
been added manually. The NumImplicitOps count wasn't kept up to date.
MachineInstr::addOperand() will now consistently place all explicit
operands before all the implicit operands, regardless of the order they
are added. It is possible to change an MI opcode and add additional
explicit operands. They will be inserted before any existing implicit
operands.
The only exception is inline asm instructions where operands are never
reordered. This is because of a hack that marks explicit clobber regs
on inline asm as <implicit-def> to please the fast register allocator.
This hack can go away when InstrEmitter and FastIsel can add exact
<dead> flags to physreg defs.
llvm-svn: 140744