Commit Graph

60645 Commits

Author SHA1 Message Date
Nadav Rotem a1e5e44eb3 Rename the slp-vectorizer clang/llvm flags. No functionality change.
llvm-svn: 179505
2013-04-15 04:54:42 +00:00
Nadav Rotem 5d393c416f SLPVectorizer: Add support for vectorizing trees that start at compare instructions.
llvm-svn: 179504
2013-04-15 04:25:27 +00:00
Hal Finkel 95e6ea69be Mark all PPC comparison instructions as not having side effects
Now that the CR spilling issues have been resolved, we can remove the
unmodeled-side-effect attributes from the comparison instructions (and also
mark them as isCompare). By allowing these, by default, to have unmodeled side
effects, we were hiding problems with CR spilling; but everything seems much
happier now.

llvm-svn: 179502
2013-04-15 02:37:46 +00:00
Hal Finkel 6736988ae2 Fix PPC64 CR spill location for callee-saved registers
This fixes an ABI bug for non-Darwin PPC64. For the callee-saved condition
registers, the spill location is specified relative to the stack pointer (SP +
8). However, this is not relative to the SP after the new stack frame is
established, but instead relative to the caller's stack pointer (it is stored
into the linkage area of the parent's stack frame).

So, like with the link register, we don't directly spill the CRs with other
callee-saved registers, but just mark them to be spilled during prologue
generation.

In practice, this reverts r179457 for PPC64 (but leaves it in place for PPC32).

llvm-svn: 179500
2013-04-15 02:07:05 +00:00
Nico Rieck 334c7bc7eb Use object file specific section type for initial text section
llvm-svn: 179494
2013-04-14 21:18:36 +00:00
David Majnemer 1fae195557 Reorders two transforms that collide with each other
One performs: (X == 13 | X == 14) -> X-13 <u 2
The other: (A == C1 || A == C2) -> (A & ~(C1 ^ C2)) == C1

The problem is that there are certain values of C1 and C2 that
trigger both transforms but the first one blocks out the second,
this generates suboptimal code.

Reordering the transforms should be better in every case and
allows us to do interesting stuff like turn:
  %shr = lshr i32 %X, 4
  %and = and i32 %shr, 15
  %add = add i32 %and, -14
  %tobool = icmp ne i32 %add, 0

into:
  %and = and i32 %X, 240
  %tobool = icmp ne i32 %and, 224

llvm-svn: 179493
2013-04-14 21:15:43 +00:00
Benjamin Kramer 7d62ea86e5 Miscellaneous cleanups for VecUtils.h
llvm-svn: 179483
2013-04-14 09:33:08 +00:00
Nadav Rotem 3403c11529 SLP: Document the scalarization cost method.
llvm-svn: 179479
2013-04-14 07:22:22 +00:00
Nadav Rotem 0db0690a70 Document the decision to assume that the cost of floats is twice as much as integers.
llvm-svn: 179478
2013-04-14 05:55:18 +00:00
Jakob Stoklund Olesen eed1072ff8 Use i32 for all SPARC shift amounts, even in 64-bit mode.
Test case by llvm-stress.

llvm-svn: 179477
2013-04-14 05:48:50 +00:00
Nadav Rotem 54b413d157 SLPVectorizer: Add support for trees that don't start at binary operators, and add the cost of extracting values from the roots of the tree.
llvm-svn: 179475
2013-04-14 05:15:53 +00:00
Jakob Stoklund Olesen c3c28f8599 Add support for the abs64 SPARC v9 code model.
For when 16 TB just isn't enough.

llvm-svn: 179474
2013-04-14 05:10:36 +00:00
Jakob Stoklund Olesen c8fc76b078 Add support for the SPARC v9 abs44 code model.
This is the default model for non-PIC 64-bit code. It supports
text+data+bss linked anywhere in the low 16 TB of the address space.

llvm-svn: 179473
2013-04-14 04:57:51 +00:00
Jakob Stoklund Olesen 2e64d7ab1d Use target flags for printing SPARC asm operands.
64-bit code models need multiple relocations that can't be inferred from
the opcode like they can in 32-bit code.

llvm-svn: 179472
2013-04-14 04:35:19 +00:00
Jakob Stoklund Olesen e0fc832b77 Also put target flags on SPARC constant pool references.
Constant pool entries are accessed exactly the same way as global
variables.

llvm-svn: 179471
2013-04-14 04:35:16 +00:00
Nadav Rotem 0b9cf8567b SLPVectorizer: add initial support for reduction variable vectorization.
llvm-svn: 179470
2013-04-14 03:22:20 +00:00
Jakob Stoklund Olesen dc1ed57858 Fix patterns for 64-bit pointers.
This fixes the pic32 code model for SPARC v9.

llvm-svn: 179469
2013-04-14 01:53:23 +00:00
Jakob Stoklund Olesen 1fb08a8b08 Add target flags to SPARC address operands.
SDNodes and MachineOperands get target flags representing the %hi() and
%lo() assembly annotations that eventually become relocations.

Also define flags to be used by the 64-bit code models.

llvm-svn: 179468
2013-04-14 01:33:32 +00:00
Hal Finkel 2f29391504 Mark all PPC CR registers to be spilled as live-in and tag MFCR appropriately
Leaving MFCR has having unmodeled side effects is not enough to prevent
unwanted instruction reordering post-RA. We could probably apply a stronger
barrier attribute, but there is a better way: Add all (not just the first) CR
to be spilled as live-in to the entry block, and add all CRs to the MFCR
instruction as implicitly killed.

Unfortunately, I don't have a small test case.

llvm-svn: 179465
2013-04-13 23:06:15 +00:00
Jakob Stoklund Olesen 15b3e90081 Define SPARC code models.
Currently, only abs32 and pic32 are implemented. Add a test case for
abs32 with 64-bit code. 64-bit PIC code is currently broken.

llvm-svn: 179463
2013-04-13 19:02:23 +00:00
Jakob Stoklund Olesen 6a0a3eb53e Use the correct types when matching ADDRri patterns from frame indexes.
It doesn't seem like anybody is checking types this late in isel, so no
test case.

llvm-svn: 179462
2013-04-13 19:02:16 +00:00
Benjamin Kramer adc1727c39 GlobalDCE: Fix an oversight in my last commit that could lead to crashes.
There is a Constant with non-constant operands: blockaddress.

llvm-svn: 179460
2013-04-13 16:11:14 +00:00
Benjamin Kramer 89ca4bc6d4 Fix a scalability issue with complex ConstantExprs.
This is basically the same fix in three different places. We use a set to avoid
walking the whole tree of a big ConstantExprs multiple times.

For example: (select cmp, (add big_expr 1), (add big_expr 2))
We don't want to visit big_expr twice here, it may consist of thousands of
nodes.

The testcase exercises this by creating an insanely large ConstantExprs out of
a loop. It's questionable if the optimizer should ever create those, but this
can be triggered with real C code. Fixes PR15714.

llvm-svn: 179458
2013-04-13 12:53:18 +00:00
Hal Finkel d85a04b3df Spill and restore PPC CR registers using the FP when we have one
For functions that need to spill CRs, and have dynamic stack allocations, the
value of the SP during the restore is not what it was during the save, and so
we need to use the FP in these cases (as for all of the other spills and
restores, but the CR restore has a special code path because its reserved slot,
like the link register, is specified directly relative to the adjusted SP).

llvm-svn: 179457
2013-04-13 08:09:20 +00:00
Andrew Trick 1f0bb69b6c MI-Sched: DEBUG formatting.
llvm-svn: 179452
2013-04-13 06:07:49 +00:00
Andrew Trick be2bccbce9 MI-Sched cleanup. If an instruction has no valid sched class, do not attempt to check for a variant.
llvm-svn: 179451
2013-04-13 06:07:45 +00:00
Andrew Trick f7fd6b9e3a X86 machine model: reduce SandyBridge and Haswell ILPWindow.
The initial values were arbitrary. I want them to be more
conservative. This represents the number of latency cycles hidden by
OOO execution. In practice, I think it should be within a small factor
of the complex floating point operation latency so the scheduler can
make some attempt to hide latency even for smallish blocks.

These are by no means the best values, just a starting point for
tuning heuristics. Some benchmarks such as TSVC run faster with this
lower value for SandyBridge. I haven't run anything on Haswell, but
it's shouldn't be 2x SB.

llvm-svn: 179450
2013-04-13 06:07:43 +00:00
Andrew Trick e833e1cd6e MI-Sched: schedule physreg copies.
The register allocator expects minimal physreg live ranges. Schedule
physreg copies accordingly. This is slightly tricky when they occur in
the middle of the scheduling region. For now, this is handled by
rescheduling the copy when its associated instruction is
scheduled. Eventually we may instead bundle them, but only if we can
preserve the bundles as parallel copies during regalloc.

llvm-svn: 179449
2013-04-13 06:07:40 +00:00
Andrew Trick 52b8387fd1 Catch another case where SD fails to propagate node order.
I need to handle this for the test case in my following scheduler
commit.

Work is already under way to redesign the mechanism for node order
propagation because this case by case approach is unmaintainable.

llvm-svn: 179448
2013-04-13 06:07:36 +00:00
Akira Hatanaka a6bbde5839 [mips] Move MipsTargetLowering::lowerINTRINSIC_W_CHAIN and
lowerINTRINSIC_WO_CHAIN into MipsSETargetLowering.

No functionality changes.

llvm-svn: 179444
2013-04-13 02:13:30 +00:00
Rafael Espindola 9b709259e1 Finish templating MachObjectFile over endianness.
We are now able to handle big endian macho files in llvm-readobject. Thanks to
David Fang for providing the object files.

llvm-svn: 179440
2013-04-13 01:45:40 +00:00
Akira Hatanaka 2f08822f9d [mips] Reapply r179420 and r179421.
llvm-svn: 179434
2013-04-13 00:55:41 +00:00
Akira Hatanaka 48996b0608 [mips] Override TargetLoweringBase::isShuffleMaskLegal.
llvm-svn: 179433
2013-04-13 00:45:02 +00:00
Chad Rosier 43554eed5e [ms-inline asm] Simplify the logic by using parsePrimaryExpr. No functional
change intended.  Test case previously added in r178568.
Part of rdar://13611297

llvm-svn: 179425
2013-04-12 23:03:20 +00:00
Akira Hatanaka 8ed2892c1c Revert r179420 and r179421.
llvm-svn: 179422
2013-04-12 22:40:07 +00:00
Akira Hatanaka 931ad87f6a [mips] Instruction selection patterns for carry-setting and using add
instructions.

llvm-svn: 179421
2013-04-12 22:24:52 +00:00
Akira Hatanaka 8f41dd923e [mips] v4i8 and v2i16 add, sub and mul instruction selection patterns.
llvm-svn: 179420
2013-04-12 22:14:24 +00:00
Nadav Rotem 4e4d45e507 Revert r179409 because it caused some warnings and some of the build bots fail.
llvm-svn: 179418
2013-04-12 22:02:26 +00:00
Benjamin Kramer e89c705030 InstCombine: Check the operand types before merging fcmp ord & fcmp ord.
Fixes PR15737.

llvm-svn: 179417
2013-04-12 21:56:23 +00:00
Nadav Rotem 8543ba3e52 SLPVectorizer: add support for vectorization of diamond shaped trees. We now perform a preliminary traversal of the graph to collect values with multiple users and check where the users came from.
llvm-svn: 179414
2013-04-12 21:16:54 +00:00
Nadav Rotem 87a0af6e1b CostModel: increase the default cost of supported floating point operations from 1 to two. Fixed a few tests that changes because now the cost of one insert + a vector operation on two doubles is lower than two scalar operations on doubles.
llvm-svn: 179413
2013-04-12 21:15:03 +00:00
Nadav Rotem 4da0ab1d68 Add debug prints.
llvm-svn: 179412
2013-04-12 21:11:14 +00:00
Nadav Rotem e4b8aa001c Add support for additional vector instructions in the interpreter.
patch by Veselov, Yuri <Yuri.Veselov@intel.com>.

llvm-svn: 179409
2013-04-12 20:45:20 +00:00
Chad Rosier d383db5172 [ms-inline asm] Move this logic into a static function as it's only applicable
when parsing MS-style inline assembly.  No functional change intended.

llvm-svn: 179407
2013-04-12 20:20:54 +00:00
Chad Rosier e9902d8325 [ms-inline asm] Address the FIXME for ImmDisp before brackets. This
is a follow on to r179393 and r179399.  Test case to be added on
the clang side.
Part of rdar://13453209

llvm-svn: 179403
2013-04-12 19:51:49 +00:00
Chad Rosier 152749ce80 [ms-inline asm] Have the [ Symbol ] case fall into the more general logic. This
is a follow on to r179393.  Test case to be added on the clang side.
Part of rdar://13453209

llvm-svn: 179399
2013-04-12 18:54:20 +00:00
Quentin Colombet c313220b18 ARM: Correct printing of pre-indexed operands.
According to the ARM reference manual, constant offsets are mandatory for pre-indexed addressing modes.
The MC disassembler was not obeying this when the offset is 0.
It was producing instructions like: str r0, [r1]!.
Correct syntax is: str r0, [r1, #0]!.

This change modifies the dumping of operands so that the offset is always printed, regardless of its value, when pre-indexed addressing mode is used.

Patch by Mihail Popa <Mihail.Popa@arm.com>

llvm-svn: 179398
2013-04-12 18:47:25 +00:00
Chad Rosier 175d0aeef3 [ms-inline asm] Add support for operands that include both a symbol and an
immediate displacement.  Specifically, add support for generating the proper IR.
We've been able to parse this for some time now.  Test case to be added on the
clang side.
Part of rdar://13453209

llvm-svn: 179393
2013-04-12 18:21:18 +00:00
Hal Finkel 1b58f335ca PPC: Remove (broken) nested implicit definition lists
TableGen will not combine nested list 'let' bindings into a single list, and
instead uses only the inner scope. As a result, several instruction definitions
were missing implicit register defs that were in outer scopes. This de-nests
these scopes and makes all instructions have only one let binding which sets
implicit register definitions.

llvm-svn: 179392
2013-04-12 18:17:57 +00:00
Hal Finkel 2277196f64 Add a comment about the PPC Interpretation64Bit bit
llvm-svn: 179391
2013-04-12 18:17:38 +00:00