Commit Graph

64597 Commits

Author SHA1 Message Date
Craig Topper ef9e993eaa Remove x86_sse42_crc32_64_8 intrinsic. It has no functional difference from x86_sse42_crc32_32_8 and was not mapped to a clang builtin. I'm not even sure why this form of the instruction is even called out explicitly in the docs. Also add AutoUpgrade support to convert it into the other intrinsic with appropriate trunc and zext.
llvm-svn: 192672
2013-10-15 05:20:47 +00:00
Andrew Trick 3a99693c5a Improve on r192635, ExeDepsFix for avx, and add a test case.
rdar:15221834 False AVX register dependencies cause 5x slowdown on
flops-5/6 and significant slowdown on several others.

This was blocking the switch to MI-Sched.

llvm-svn: 192669
2013-10-15 03:39:43 +00:00
Akira Hatanaka 06aff571a3 [mips] Define a pseudo instruction which writes to both the lower and higher
parts of the accumulators and gets expanded post-RA.

llvm-svn: 192667
2013-10-15 01:48:30 +00:00
Akira Hatanaka ec67c90216 [mips] Use predicates to guard instructions using accumulator registers instead
of relying on AddedComplexity.

llvm-svn: 192665
2013-10-15 01:21:37 +00:00
Akira Hatanaka d98c99fd01 [mips] Rename isel nodes.
llvm-svn: 192663
2013-10-15 01:12:50 +00:00
Akira Hatanaka 86c3c794fa [mips] Transfer kill flag to the newly created operand.
llvm-svn: 192662
2013-10-15 01:06:30 +00:00
Akira Hatanaka 8368b3b3df [mips] Set HI/LO registers' HWEncoding field.
llvm-svn: 192661
2013-10-15 01:00:00 +00:00
Akira Hatanaka 8f31b2fd3b [mips] Delete unnecessary code.
llvm-svn: 192660
2013-10-15 00:48:42 +00:00
Michael Gottesman 53c885c37a Update comment list of GLOBALVAR modifiers in BitcodeWriter to include externally_initialized.
Thanks to Shuxin Yang for catching this.

llvm-svn: 192637
2013-10-14 22:36:51 +00:00
Quentin Colombet 778dba1dd8 [X86][FastISel] During X86 fastisel, the address of indirect call was resolved
through bitcast, ptrtoint, and inttoptr instructions. This is valid
only if the related instructions are in that same basic block, otherwise
we may reference variables that were not live accross basic blocks
resulting in undefined virtual registers.

The bug was exposed when both SDISel and FastISel were used within the same
function, i.e., one basic block is issued with FastISel and another with SDISel,
as demonstrated with the testcase.

<rdar://problem/15192473>

llvm-svn: 192636
2013-10-14 22:32:09 +00:00
Andrew Trick b6d56be69d Fix the ExecutionDepsFix pass to handle AVX instructions.
This pass is needed to break false dependencies. Without it, unlucky
register assignment can result in wild (5x) swings in
performance. This pass was trying to handle AVX but not getting it
right. AVX doesn't have partial register defs, it has unused register
reads in which the high bits of a source operand are copied into the
unused bits of the dest.

Fixing this requires conservative liveness analysis. This is awkard
because the pass already has its own pseudo-liveness. However, proper
liveness is expensive, and we would like to use a generic utility to
compute it. The fix only invokes liveness on-demand. It is rare to
detect a case that needs undef-read dependence breaking, but when it
happens, it can be needed many times within a very large block.

I think the existing heuristic which uses a register window of 16 is
too conservative for loop-carried false dependencies. If the loop is a
reduction. The out-of-order engine may be able to execute several loop
iterations in parallel. However, I'll leave this tuning exercise for
next time.

llvm-svn: 192635
2013-10-14 22:19:03 +00:00
Andrew Trick e2f7cc4cf3 LiveRegUnits: Use *MBB for consistency and convenience.
llvm-svn: 192634
2013-10-14 22:18:59 +00:00
Andrew Trick 8460a3bfa1 whitespace
llvm-svn: 192633
2013-10-14 22:18:56 +00:00
Eric Christopher 740025745b Revert part of a fix from 2010, changes since then:
a) x86-64 TLS has been documented
b) the code path should use movq for the correct relocation
   to be generated.

I've also added a fixme for the test case that we should improve
the code generated, it should look something like is documented
in the tls abi document.

llvm-svn: 192631
2013-10-14 21:52:26 +00:00
Eric Christopher 755711e510 Reformat this routine slightly.
llvm-svn: 192630
2013-10-14 21:52:23 +00:00
Eric Christopher 584d71c6cb Remove some extraneous whitespace.
llvm-svn: 192629
2013-10-14 21:52:18 +00:00
Andrew Trick 3f4d6c6538 LiveRegUnits::removeRegsInMask safety.
Clobbering is exclusive not inclusive on register units.
For liveness, we need to consider all the preserved registers.
e.g. A regmask that clobbers YMM0 may preserve XMM0.
Units are only clobbered when all super-registers are clobbered.

llvm-svn: 192623
2013-10-14 20:45:19 +00:00
Andrew Trick 276dd453f0 Use a SparseSet in LiveRegUnits.
Some clients may add block live ins and may track liveness over a
large scope. This guarantees an efficient implementation in all cases
with no memory allocation/deallocation, independent of the number of
target registers. It could be slightly less convenient but is fine in
the expected case.

llvm-svn: 192622
2013-10-14 20:45:17 +00:00
Andrew Trick 0aed0cfc44 Move LiveRegUnits implementation into .cpp. Comment and format.
llvm-svn: 192621
2013-10-14 20:45:14 +00:00
Andrew Trick ff3585c51c Convert LiveRegUnits methods to the current convention (it's new code).
llvm-svn: 192619
2013-10-14 20:45:09 +00:00
Manman Ren c6b6392794 Debug Info: static member DIE creation.
Clean up creation of static member DIEs. We can create static member DIEs from
two places, so we call getOrCreateStaticMemberDIE from the two places.

getOrCreateStaticMemberDIE will get or create the context DIE first, then it
will check if the DIE already exists, if not, we create the static member DIE
and add it to the context.

Creation of static member DIEs are handled in a similar way as subprogram DIEs.

llvm-svn: 192618
2013-10-14 20:33:57 +00:00
David Blaikie 6004dbc9fa Fix indenting.
That wasn't confusing /at all/...

llvm-svn: 192617
2013-10-14 20:15:04 +00:00
Will Dietz 5cb7f4e3f2 MachineSink: Fix and tweak critical-edge breaking heuristic.
Per original comment, the intention of this loop
is to go ahead and break the critical edge
(in order to sink this instruction) if there's
reason to believe doing so might "unblock" the
sinking of additional instructions that define
registers used by this one.  The idea is that if
we have a few instructions to sink "together"
breaking the edge might be worthwhile.

This commit makes a few small changes
to help better realize this goal:

First, modify the loop to ignore registers
defined by this instruction.  We don't
sink definitions of physical registers,
and sinking an SSA definition isn't
going to unblock an upstream instruction.

Second, ignore uses of physical registers.
Instructions that define physical registers are
rejected for sinking, and so moving this one
won't enable moving any defining instructions.
As an added bonus, while virtual register
use-def chains are generally small due
to SSA goodness, iteration over the uses
and definitions (used by hasOneNonDBGUse)
for physical registers like EFLAGS
can be rather expensive in practice.
(This is the original reason for looking at this)

Finally, to keep things simple continue
to only consider this trick for registers that
have a single use (via hasOneNonDBGUse),
but to avoid spuriously breaking critical edges
only do so if the definition resides
in the same MBB and therefore this one directly
blocks it from being sunk as well.
If sinking them together is meant to be,
let the iterative nature of this pass
sink the definition into this block first.

Update tests to accomodate this change,
add new testcase where sinking avoids pipeline stalls.

llvm-svn: 192608
2013-10-14 16:57:17 +00:00
Rafael Espindola 8c1d78ad51 Remove lib/Transforms/Instrumentation/ProfilingUtils.*
They were leftover from the old profiling support.

Patch by Alastair Murray.

llvm-svn: 192605
2013-10-14 16:46:46 +00:00
Rafael Espindola 9770bde505 Remove the now unused strong phi elimination pass.
llvm-svn: 192604
2013-10-14 16:39:04 +00:00
Chris Lattner 94fc4bed1f Basic blocks typically have few predecessors. Use a SmallDenseMap to
avoid a heap allocation when this is the case.

llvm-svn: 192602
2013-10-14 16:05:55 +00:00
Evgeniy Stepanov be83d8f693 [msan] Instrument x86.*_cvt* intrinsics.
Currently MSan checks that arguments of *cvt* intrinsics are fully initialized.
That's too much to ask: some of them only operate on lower half, or even
quarter, of the input register.

llvm-svn: 192599
2013-10-14 15:16:25 +00:00
Chad Rosier d1f40d760a [AArch64] Add support for NEON scalar integer compare instructions.
llvm-svn: 192596
2013-10-14 14:37:20 +00:00
Bernard Ogden 53169762d0 Add Cortex-A57 support
llvm-svn: 192591
2013-10-14 13:17:07 +00:00
Bernard Ogden 4400cde89a Add subtarget feature support for Cortex-A53
Some previous implicit defaults have changed, for example FP and NEON
are now on by default.

llvm-svn: 192590
2013-10-14 13:16:57 +00:00
Matheus Almeida 2102188cfc [mips][msa] Direct Object Emission support for BIT instructions.
List of instructions:
bclri.{b,h,w,d}
binsli.{b,h,w,d}
binsri.{b,h,w,d}
bnegi.{b,h,w,d}
bseti.{b,h,w,d}
sat_s.{b,h,w,d}
sat_u.{b,h,w,d}
slli.{b,h,w,d}
srai.{b,h,w,d}
srari.{b,h,w,d}
srli.{b,h,w,d}
srlri.{b,h,w,d}

llvm-svn: 192589
2013-10-14 13:07:39 +00:00
Matheus Almeida 5be0cd8720 [mips][msa] Direct Object Emission support for VEC instructions.
List of instructions:
and.v, bmnz.v, bmz.v, bsel.v, nor.v, or.v, xor.v.

llvm-svn: 192588
2013-10-14 12:57:18 +00:00
Matheus Almeida 7d65ea76ea [mips][msa] Direct Object Emission of INSVE.{b,h,w,d}.
llvm-svn: 192587
2013-10-14 12:38:17 +00:00
Matheus Almeida bc189eb318 [mips][msa] Direct Object Emission for the majority of the ELM instructions.
List of instructions:
copy_s.{b,h,w}
copy_u.{b,h,w}
sldi.{b,h,w,d}
splati.{b,h,w,d}

llvm-svn: 192586
2013-10-14 12:22:43 +00:00
Matheus Almeida b74293dc55 [mips][msa] Direct Object Emission of INSERT.{B,H,W} instruction.
INSERT is the first type of MSA instruction that requires a change to the way
MSA registers are parsed. This happens because MSA registers may be suffixed by
an index in the form of an immediate or a general purpose register. The changes
to parseMSARegs reflect that requirement.

llvm-svn: 192582
2013-10-14 11:49:30 +00:00
Evgeniy Stepanov 9b5517b127 [msan] Fix handling of scalar select of vectors.
llvm-svn: 192575
2013-10-14 09:52:09 +00:00
Elena Demikhovsky 82a46ebe0a Fixed a bug in dynamic allocation memory on stack.
The alignment of allocated space was wrong, see Bugzila 17345.

Done by Zvi Rackover <zvi.rackover@intel.com>.

llvm-svn: 192573
2013-10-14 07:26:51 +00:00
Craig Topper d7abdb6f12 Create classes to reduce the size of the tablegen entries for the CRC32 instructions.
llvm-svn: 192568
2013-10-14 05:19:58 +00:00
Craig Topper a422b09ae3 Allow pinsrw/pinsrb/pextrb/pextrw/movmskps/movmskpd/pmovmskb/extractps instructions to parse either GR32 or GR64 without resorting to duplicating instructions.
llvm-svn: 192567
2013-10-14 04:55:01 +00:00
Craig Topper 4432208884 Add disassembler support for SSE4.1 register/register form of PEXTRW. There is a shorter encoding that was part of SSE2, but a memory form was added in SSE4.1. This is the register form of that encoding.
llvm-svn: 192566
2013-10-14 01:42:32 +00:00
Craig Topper 7158745e55 Mark MOVMSKPS/MOVMSKPD/VPINSRWrr64i as AsmParserOnly to remove them from the disassembler tables. Add PINSRWrr64i to complement the AVX version.
llvm-svn: 192565
2013-10-14 01:21:22 +00:00
David Majnemer 93fdc3fabf Windows: Fix a typo in an assert
llvm-svn: 192564
2013-10-14 01:17:32 +00:00
Craig Topper c4a5a3f65d Don't use 64-bit versions of MOVMSKPD in CodeGen. The instructions only produce a 1-bit result so we can just use SUBREG_TO_REG to extend the 32-bit versions.
llvm-svn: 192562
2013-10-14 00:24:33 +00:00
David Majnemer 7af18578f8 Windows: Don't bother with pinning Kernel32.dll
We don't delay load it so it shouldn't be going anywhere.

llvm-svn: 192561
2013-10-14 00:06:58 +00:00
Will Dietz 5357df6290 MC: Don't assume incoming StringRef's are null terminated.
This can happen when processing command line arguments, which
are often stored as std::string's and later turned into
StringRef's via std::string::data().  Unfortunately this
is not guaranteed to return a null-terminated string
until C++11, causing breakage on platforms that don't do this.

llvm-svn: 192558
2013-10-13 22:09:26 +00:00
Vincent Lejeune d6cbede9c5 R600: improve dump of S_WAITCNT
llvm-svn: 192557
2013-10-13 17:56:28 +00:00
Vincent Lejeune 4ee6dd6136 R600/SI: Add SinkingPass before ISel
llvm-svn: 192556
2013-10-13 17:56:21 +00:00
Vincent Lejeune d623644d17 R600/SI: Support byval arguments
llvm-svn: 192555
2013-10-13 17:56:16 +00:00
Vincent Lejeune fa58a5fb60 R600: Use masked read sel for texture instructions
llvm-svn: 192554
2013-10-13 17:56:10 +00:00
Vincent Lejeune 301beb80d4 R600: fix swizzle export
llvm-svn: 192553
2013-10-13 17:56:04 +00:00