Commit Graph

24457 Commits

Author SHA1 Message Date
Tim Northover 57954f04b3 TMP: LEA64_32r fixing
llvm-svn: 183069
2013-06-01 10:21:54 +00:00
Tim Northover 3a1fd4c0ac X86: change MOV64ri64i32 into MOV32ri64
The MOV64ri64i32 instruction required hacky MCInst lowering because it
was allocated as setting a GR64, but the eventual instruction ("movl")
only set a GR32. This converts it into a so-called "MOV32ri64" which
still accepts a (appropriate) 64-bit immediate but defines a GR32.
This is then converted to the full GR64 by a SUBREG_TO_REG operation,
thus keeping everyone happy.

This fixes a typo in the opcode field of the original patch, which
should make the legact JIT work again (& adds test for that problem).

llvm-svn: 183068
2013-06-01 09:55:14 +00:00
Venkatraman Govindaraju 3521dcdcc4 [Sparc] Generate correct code for leaf functions with stack objects
llvm-svn: 183067
2013-06-01 04:51:18 +00:00
Ahmed Bougacha b1a4d9da3b Make SubRegIndex size mandatory, following r183020.
This also makes TableGen able to compute sizes/offsets of synthesized
indices representing tuples.

llvm-svn: 183061
2013-05-31 23:45:26 +00:00
Eric Christopher e1e57e5ebd Temporarily Revert "X86: change MOV64ri64i32 into MOV32ri64" as it
seems to have caused PR16192 and other JIT related failures.

llvm-svn: 183059
2013-05-31 23:30:45 +00:00
Benjamin Kramer fae7ff12d2 NVPTX: Don't even create a regalloc if we're not going to use it.
Fixes a leak found by valgrind.

llvm-svn: 183031
2013-05-31 19:21:58 +00:00
Ahmed Bougacha f1ed334d55 Add a way to define the bit range covered by a SubRegIndex.
NOTE: If this broke your out-of-tree backend, in *RegisterInfo.td, change
the instances of SubRegIndex that have a comps template arg to use the
ComposedSubRegIndex class instead.

In TableGen land, this adds Size and Offset attributes to SubRegIndex,
and the ComposedSubRegIndex class, for which the Size and Offset are
computed by TableGen. This also adds an accessor in MCRegisterInfo, and
Size/Offsets for the X86 and ARM subreg indices.

llvm-svn: 183020
2013-05-31 17:08:36 +00:00
Tim Northover 4d14144024 ARM: permit upper-case BE/LE on setend instruction
Patch by Amaury de la Vieuville.

llvm-svn: 183012
2013-05-31 15:58:45 +00:00
Tim Northover 4173e29a98 ARM: add fstmx and fldmx instructions for assembly
These instructions are deprecated oddities, but we still need to be able to
disassemble (and reassemble) them if and when they're encountered.

Patch by Amaury de la Vieuville.

llvm-svn: 183011
2013-05-31 15:55:51 +00:00
Tim Northover 1bb672da81 ARM: fix VEXT encoding corner case
The disassembly of VEXT instructions was too lax in the bits checked. This
fixes the case where the instruction affects Q-registers but a misaligned lane
was specified (should be UNDEFINED).

Patch by Amaury de la Vieuville

llvm-svn: 183003
2013-05-31 13:47:25 +00:00
Richard Sandiford 30efd87f6e [SystemZ] Don't use LOAD and STORE REVERSED for volatile accesses
Unlike most -- hopefully "all other", but I'm still checking -- memory
instructions we support, LOAD REVERSED and STORE REVERSED may access
the memory location several times.  This means that they are not suitable
for volatile loads and stores.

This patch is a prerequisite for better atomic load and store support.
The same principle applies there: almost all memory instructions we
support are inherently atomic ("block concurrent"), but LOAD REVERSED
and STORE REVERSED are exceptions.

Other instructions continue to allow volatile operands.  I will add
positive "allows volatile" tests at the same time as the "allows atomic
load or store" tests.

llvm-svn: 183002
2013-05-31 13:25:22 +00:00
Justin Holewinski dbb3b2f4b6 [NVPTX] Re-enable support for virtual registers in the final output
Now that 3.3 is branched, we are re-enabling virtual registers to help
iron out bugs before the next release. Some of the post-RA passes do
not play well with virtual registers, so we disable them for now. The
needed functionality of the PrologEpilogInserter pass is copied to a
new backend-specific NVPTXPrologEpilog pass.

The test for this commit is not breaking the existing tests.

llvm-svn: 182998
2013-05-31 12:14:49 +00:00
Tim Northover d4736d67f4 X86: change MOV64ri64i32 into MOV32ri64
The MOV64ri64i32 instruction required hacky MCInst lowering because it was
allocated as setting a GR64, but the eventual instruction ("movl") only set a
GR32. This converts it into a so-called "MOV32ri64" which still accepts a
(appropriate) 64-bit immediate but defines a GR32. This is then converted to
the full GR64 by a SUBREG_TO_REG operation, thus keeping everyone happy.

llvm-svn: 182991
2013-05-31 09:57:13 +00:00
Akira Hatanaka 2bf97336af [mips] Big-endian code generation for atomic instructions.
Patch by Jyun-Yan You.

llvm-svn: 182984
2013-05-31 03:25:44 +00:00
Rafael Espindola 99bd2ae479 Revert r182937 and r182877.
r182877 broke MCJIT tests on ARM and r182937 was working around another failure
by r182877.

This should make the ARM bots green.

llvm-svn: 182960
2013-05-30 20:37:52 +00:00
Tim Northover 64ec0ff433 X86: use sub-register sequences for MOV*r0 operations
Instead of having a bunch of separate MOV8r0, MOV16r0, ... pseudo-instructions,
it's better to use a single MOV32r0 (which will expand to "xorl %reg, %reg")
and obtain other sizes with EXTRACT_SUBREG and SUBREG_TO_REG. The encoding is
smaller and partial register updates can sometimes be avoided.

Until recently, this sequence was a barrier to rematerialization though. That
should now be fixed so it's an appropriate time to make the change.

llvm-svn: 182928
2013-05-30 13:19:42 +00:00
Justin Holewinski 994d66a345 [NVPTX] Fix case where a sext load of an i1 type may produce an
ld.u1 instead of an ld.u8.

llvm-svn: 182924
2013-05-30 12:22:39 +00:00
Tim Northover 04eb4234fc X86: change zext moves to use sub-register infrastructure.
32-bit writes on amd64 zero out the high bits of the corresponding 64-bit
register. LLVM makes use of this for zero-extension, but until now relied on
custom MCLowering and other code to fixup instructions. Now we have proper
handling of sub-registers, this can be done by creating SUBREG_TO_REG
instructions at selection-time.

Should be no change in functionality.

llvm-svn: 182921
2013-05-30 10:43:18 +00:00
Richard Sandiford 46af5a2cdc [SystemZ] Enable unaligned accesses
The code to distinguish between unaligned and aligned addresses was
already there, so this is mostly just a switch-on-and-test process.

llvm-svn: 182920
2013-05-30 09:45:42 +00:00
Andrew Trick ad6d08ac6f Order CALLSEQ_START and CALLSEQ_END nodes.
Fixes PR16146: gdb.base__call-ar-st.exp fails after
pre-RA-sched=source fixes.

Patch by Xiaoyi Guo!

This also fixes an unsupported dbg.value test case. Codegen was
previously incorrect but the test was passing by luck.

llvm-svn: 182885
2013-05-29 22:03:55 +00:00
Ahmed Bougacha 00e08db393 X86: Fix Defs/Uses for insts that imp-def/imp-use both an A-register and EFLAGS.
This corrects a problem where x86 instructions that implicitly define/use both
an A-register (RAX, EAX, ..) and EFLAGS were declared as only defining/using
EFLAGS, because the outer "let Defs/Uses = [EFLAGS]" in the various multiclasses
overrides the "let Defs/Uses = [areg]" in BinOpAI.

The instructions deriving from BinOpAI were moved out of the "let Defs", and a
BinOpAI_FF class was created, for instructions that implicitly define and use
EFLAGS and the A-register (SBC, ADC).

llvm-svn: 182883
2013-05-29 21:13:57 +00:00
Chad Rosier 33b736626e Don't assume the registers will be enumerated sequentially.
llvm-svn: 182879
2013-05-29 20:42:21 +00:00
JF Bastien f60e0e44ca Enable FastISel on ARM for Linux and NaCl
FastISel was only enabled for iOS ARM and Thumb2, this patch enables it
for ARM (not Thumb2) on Linux and NaCl.

Thumb2 support needs a bit more work, mainly around register class
restrictions.

The patch punts to SelectionDAG when doing TLS relocation on non-Darwin
targets. I will fix this and other FastISel-to-SelectionDAG failures in
a separate patch.

The patch also forces FastISel to retain frame pointers: iOS always
keeps them for backtracking (so emitted code won't change because of
this), but Linux was getting much worse code that was incorrect when
using big frames (such as test-suite's lencod). I'll also fix this in a
later patch, it will probably require a peephole so that FastISel
doesn't rematerialize frame pointers back-to-back.

The test changes are straightforward, similar to:
  http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130513/174279.html
They also add a vararg test that got dropped in that change.

I ran all of test-suite on A15 hardware with --optimize-option=-O0 and
all the tests pass.

llvm-svn: 182877
2013-05-29 20:38:10 +00:00
Bill Wendling 70b1400e6d Don't reach into the middle of TargetMachine and cache one of its ivars.
Not only does this break encapsulation, it's gross.

llvm-svn: 182876
2013-05-29 20:37:19 +00:00
JF Bastien 13969d0ab6 Tidy some register classes for ARM and Thumb
Tidy up three places where the register class for ARM and Thumb wasn't
restrictive enough:
 - No PC dest for reg-reg add/orr/sub.
 - No PC dest for shifts.
 - No PC or SP for Thumb2 reg-imm add.

I encountered this while combining FastISel with
-verify-machineinstrs. These instructions defined registers whose
classes weren't restrictive enough, and the uses failed
verification. They're also undefined in the ISA, or would produce code
that FastISel wouldn't want. This doesn't fix the register class
narrowing issue (where uses should restrict definitions), and isn't
thorough, but it's a small step in the right direction.

llvm-svn: 182863
2013-05-29 15:45:47 +00:00
NAKAMURA Takumi dbd3bbe126 SparcFrameLowering.cpp: Mark verifyLeafProcRegUse() as UNUSED. [-Wunused-function]
llvm-svn: 182850
2013-05-29 12:10:42 +00:00
Richard Sandiford e1d9f00f09 [SystemZ] Immediate compare-and-branch support
This patch adds support for the CIJ and CGIJ instructions.

llvm-svn: 182846
2013-05-29 11:58:52 +00:00
Patrik Hagglund ae8faf2e9a Temporary fix to get rid of gcc warning.
llvm-svn: 182832
2013-05-29 07:32:08 +00:00
Venkatraman Govindaraju ca0fe2f57e [Sparc] Add support for leaf functions in sparc backend.
llvm-svn: 182822
2013-05-29 04:46:31 +00:00
Jack Carter 0259300325 Mips assembler: Improve set register alias handling
This patch solves the problem of numeric register values not being accepted:

../set_alias.s:1:11: error: expected valid expression after comma
        .set    r4,$4
                    ^
The parsing of .set directive is changed and handling of symbols in code 
as well to enable this feature. 

The test example is added.

Patch by Vladimir Medic

llvm-svn: 182807
2013-05-28 22:21:05 +00:00
Tim Northover 8a1aa518a3 AArch64: clarify -help message
llvm-svn: 182804
2013-05-28 21:09:39 +00:00
Jyotsna Verma cceafb2d6d Hexagon: Typo fix.
llvm-svn: 182790
2013-05-28 19:01:45 +00:00
Richard Sandiford 0fb90ab0cb [SystemZ] Register compare-and-branch support
This patch adds support for the CRJ and CGRJ instructions.  Support for
the immediate forms will be a separate patch.

The architecture has a large number of comparison instructions.  I think
it's generally better to concentrate on using the "best" comparison
instruction first and foremost, then only use something like CRJ if
CR really was the natual choice of comparison instruction.  The patch
therefore opportunistically converts separate CR and BRC instructions
into a single CRJ while emitting instructions in ISelLowering.

llvm-svn: 182764
2013-05-28 10:41:11 +00:00
Richard Sandiford 53c9efd9c1 [SystemZ] Tweak SystemZInstrInfo::isBranch() interface
This is needed for the upcoming compare-and-branch patch.  No functional
change intended.

llvm-svn: 182762
2013-05-28 10:13:54 +00:00
Rafael Espindola f30f2cce50 Make helper functions static.
And remove header and cpp file that are empty after that.

llvm-svn: 182746
2013-05-27 22:34:59 +00:00
Preston Gurd 048f99de11 Convert sqrt functions into sqrt instructions when -ffast-math is in effect.
When -ffast-math is in effect (on Linux, at least), clang defines
__FINITE_MATH_ONLY__ > 0 when including <math.h>. This causes the
preprocessor to include <bits/math-finite.h>, which renames the sqrt functions.
For instance, "sqrt" is renamed as "__sqrt_finite". 

This patch adds the 3 new names in such a way that they will be treated
as equivalent to their respective original names.

llvm-svn: 182739
2013-05-27 15:44:35 +00:00
Hal Finkel 8ebfe6c263 PPC: Add a isConsecutiveLS utility function
isConsecutiveLS is a slightly more general form of
SelectionDAG::isConsecutiveLoad. Aside from also handling stores, it also does
not assume equality of the chain operands is necessary. In the case of the PPC
backend, this chain condition is checked in a more general way by the
surrounding code.

Mostly, this part of the refactoring in preparation for supporting optimized
unaligned stores.

llvm-svn: 182723
2013-05-27 02:06:39 +00:00
Hal Finkel 7d8a691b5d Prefer to duplicate PPC Altivec loads when expanding unaligned loads
When expanding unaligned Altivec loads, we use the decremented offset trick to
prevent page faults. Unfortunately, if we have a sequence of consecutive
unaligned loads, this leads to suboptimal code generation because the 'extra'
load from the first unaligned load can be combined with the base load from the
second (but only if the decremented offset trick is not used for the first).
Search up and down the chain, through loads and token factors, looking for
consecutive loads, and if one is found, don't use the offset reduction trick.
These duplicate loads are later combined to yield the desired sequence (in the
future, we might want a more-powerful chain search, but that will require some
changes to allow the combiner routines to access the AA object).

This should complete the initial implementation of the optimized unaligned
Altivec load expansion. There is some refactoring that should be done, but
that will happen when the unaligned store expansion is added.

llvm-svn: 182719
2013-05-26 18:08:30 +00:00
Hal Finkel bc2ee4c4e6 PPC: Combine duplicate (offset) lvsl Altivec intrinsics
The lvsl permutation control instruction is a function only of the alignment of
the pointer operand (relative to the 16-byte natural alignment of Altivec
vectors). As a result, multiple lvsl intrinsics where the operands differ by a
multiple of 16 can be combined.

llvm-svn: 182708
2013-05-25 04:05:05 +00:00
Andrew Trick e2431c64bc Track IR ordering of SelectionDAG nodes 3/4.
Remove the old IR ordering mechanism and switch to new one.  Fix unit
test failures.

llvm-svn: 182704
2013-05-25 03:08:10 +00:00
Andrew Trick ef9de2a739 Track IR ordering of SelectionDAG nodes 2/4.
Change SelectionDAG::getXXXNode() interfaces as well as call sites of
these functions to pass in SDLoc instead of DebugLoc.

llvm-svn: 182703
2013-05-25 02:42:55 +00:00
Hal Finkel cf2e908014 PPC: Initial support for permutation-based unaligned Altivec loads
Altivec only directly supports aligned loads, but the loads have a strange
property: If given an unaligned address, they truncate the address to the next
lower aligned address, and load from there.  This property, along with an extra
load and some special-purpose permutation-control instructions that generate
the appropriate permutations from the original unaligned address, allow
efficient lowering of aligned loads. This code uses the trick explained in the
Apple Velocity Engine optimization overview document to prevent the needed
extra load from possibly causing a page fault if the original address happens
to be aligned.

As noted in the FIXMEs, there are several additional optimizations that can be
performed to reduce the cost of these loads even more. These will be
implemented in future commits.

llvm-svn: 182691
2013-05-24 23:00:14 +00:00
Quentin Colombet f482805c28 Follow up of the introduction of MCSymbolizer.
- Ressurect old MCDisassemble API to soften transition.
- Extend MCTargetDesc to set target specific symbolizer.

llvm-svn: 182688
2013-05-24 22:51:52 +00:00
Michael J. Spencer df1ecbd734 Replace Count{Leading,Trailing}Zeros_{32,64} with count{Leading,Trailing}Zeros.
llvm-svn: 182680
2013-05-24 22:23:49 +00:00
Richard Sandiford dc5ed71353 [SystemZ] Improve AsmParser handling of invalid instructions
Previously, an invalid instruction like:

	foo     %r1, %r0

would generate the rather odd error message:

....: error: unknown token in expression
	foo     %r1, %r0
		^

We now get the more informative:

....: error: invalid instruction
	foo     %r1, %r0
	^

The same would happen if an address were used where a register was expected.
We now get "invalid operand for instruction" instead.

llvm-svn: 182644
2013-05-24 14:26:46 +00:00
Richard Sandiford 675f86996a [SystemZ] Improve AsmParser register parsing
The idea is to make sure that:

(1) "register expected" is restricted to cases where ParseRegister()
    is called and the token obviously isn't a register.

(2) "invalid register" is restricted to cases where a register-like "%..."
    sequence is found, but the "..." makes no sense.

(3) the generic "invalid operand for instruction" is used in cases where
    the wrong register type is used (GPR instead of FPR, etc.).

(4) the new "invalid register pair" is used if the register has the right type,
    but is not a valid register pair.

Testing of (1)-(3) is now restricted to regs-bad.s.  It uses a representative
instruction for each register class to make sure that only registers from
that class are accepted.

(4) is tested by both regs-bad.s (which checks all invalid register pairs)
and insn-bad.s (which tests one invalid pair for each instruction that
requires a pair).

While there, I changed "Number" to "Num" for consistency with the
operand class.

llvm-svn: 182643
2013-05-24 14:14:38 +00:00
Benjamin Kramer 534d3a4670 Remove the Copied parameter from MemoryObject::readBytes.
There was exactly one caller using this API right, the others were relying on
specific behavior of the default implementation. Since it's too hard to use it
right just remove it and standardize on the default behavior.

Defines away PR16132.

llvm-svn: 182636
2013-05-24 10:54:58 +00:00
Ahmed Bougacha aa79068157 MC: Disassembled CFG reconstruction.
This patch builds on some existing code to do CFG reconstruction from
a disassembled binary:
- MCModule represents the binary, and has a list of MCAtoms.
- MCAtom represents either disassembled instructions (MCTextAtom), or
  contiguous data (MCDataAtom), and covers a specific range of addresses.
- MCBasicBlock and MCFunction form the reconstructed CFG. An MCBB is
  backed by an MCTextAtom, and has the usual successors/predecessors.
- MCObjectDisassembler creates a module from an ObjectFile using a
  disassembler. It first builds an atom for each section. It can also
  construct the CFG, and this splits the text atoms into basic blocks.

MCModule and MCAtom were only sketched out; MCFunction and MCBB were
implemented under the experimental "-cfg" llvm-objdump -macho option.
This cleans them up for further use; llvm-objdump -d -cfg now generates
graphviz files for each function found in the binary.

In the future, MCObjectDisassembler may be the right place to do
"intelligent" disassembly: for example, handling constant islands is just
a matter of splitting the atom, using information that may be available
in the ObjectFile. Also, better initial atom formation than just using
sections is possible using symbols (and things like Mach-O's
function_starts load command).

This brings two minor regressions in llvm-objdump -macho -cfg:
- The printing of a relocation's referenced symbol.
- An annotation on loop BBs, i.e., which are their own successor.

Relocation printing is replaced by the MCSymbolizer; the basic CFG
annotation will be superseded by more related functionality.

llvm-svn: 182628
2013-05-24 01:07:04 +00:00
Ahmed Bougacha ad1084de84 Add MCSymbolizer for symbolic/annotated disassembly.
This is a basic first step towards symbolization of disassembled
instructions. This used to be done using externally provided (C API)
callbacks. This patch introduces:
- the MCSymbolizer class, that mimics the same functions that were used
  in the X86 and ARM disassemblers to symbolize immediate operands and
  to annotate loads based off PC (for things like c string literals).
- the MCExternalSymbolizer class, which implements the old C API.
- the MCRelocationInfo class, which provides a way for targets to
  translate relocations (either object::RelocationRef, or disassembler
  C API VariantKinds) to MCExprs.
- the MCObjectSymbolizer class, which does symbolization using what it
  finds in an object::ObjectFile. This makes simple symbolization (with
  no fancy relocation stuff) work for all object formats!
- x86-64 Mach-O and ELF MCRelocationInfos.
- A basic ARM Mach-O MCRelocationInfo, that provides just enough to
  support the C API VariantKinds.

Most of what works in otool (the only user of the old symbolization API
that I know of) for x86-64 symbolic disassembly (-tvV) works, namely:
- symbol references: call _foo; jmp 15 <_foo+50>
- relocations:       call _foo-_bar; call _foo-4
- __cf?string:       leaq 193(%rip), %rax ## literal pool for "hello"
Stub support is the main missing part (because libObject doesn't know,
among other things, about mach-o indirect symbols).

As for the MCSymbolizer API, instead of relying on the disassemblers
to call the tryAdding* methods, maybe this could be done automagically
using InstrInfo? For instance, even though PC-relative LEAs are used
to get the address of string literals in a typical Mach-O file, a MOV
would be used in an ELF file. And right now, the explicit symbolization
only recognizes PC-relative LEAs. InstrInfo should have already have
most of what is needed to know what to symbolize, so this can
definitely be improved.

I'd also like to remove object::RelocationRef::getValueString (it seems
only used by relocation printing in objdump), as simply printing the
created MCExpr is definitely enough (and cleaner than string concats).

llvm-svn: 182625
2013-05-24 00:39:57 +00:00
Ulrich Weigand 9948546923 [PowerPC] Remove symbolLo/symbolHi instruction operand types
Now that there is no longer any distinction between symbolLo
and symbolHi operands in either printing, encoding, or parsing,
the operand types can be removed in favor of simply using
s16imm.

This completes the patch series to decouple lo/hi operand part
processing from the particular instruction whose operand it is.

No change in code generation expected from this patch.

llvm-svn: 182618
2013-05-23 22:48:06 +00:00
Ulrich Weigand 41789de165 [PowerPC] Clean up generation of ha16() / lo16() markers
When targeting the Darwin assembler, we need to generate markers ha16() and
lo16() to designate the high and low parts of a (symbolic) immediate.  This
is necessary not just for plain symbols, but also for certain symbolic
expression, typically along the lines of ha16(A - B).  The latter doesn't
work when simply using VariantKind flags on the symbol reference.
This is why the current back-end uses hacks (explicitly called out as such
via multiple FIXMEs) in the symbolLo/symbolHi print methods.

This patch uses target-defined MCExpr codes to represent the Darwin
ha16/lo16 constructs, following along the lines of the equivalent solution
used by the ARM back end to handle their :upper16: / :lower16: markers.
This allows us to get rid of special handling both in the symbolLo/symbolHi
print method and in the common code MCExpr::print routine.  Instead, the
ha16 / lo16 markers are printed simply in a custom print routine for the
target MCExpr types.  (As a result, the symbolLo/symbolHi print methods
can now replaced by a single printS16ImmOperand routine that also handles
symbolic operands.)

The patch also provides a EvaluateAsRelocatableImpl routine to handle
ha16/lo16 constructs.  This is not actually used at the moment by any
in-tree code, but is provided as it makes merging into David Fang's
out-of-tree Mach-O object writer simpler.

Since there is no longer any need to treat VK_PPC_GAS_HA16 and
VK_PPC_DARWIN_HA16 differently, they are merged into a single
VK_PPC_ADDR16_HA (and likewise for the _LO16 types).

llvm-svn: 182616
2013-05-23 22:26:41 +00:00
Tim Northover bc93308489 ARM: implement @llvm.readcyclecounter intrinsic
This implements the @llvm.readcyclecounter intrinsic as the specific
MRC instruction specified in the ARM manuals for CPUs with the Power
Management extensions.

Older CPUs had slightly different methods which may also have to be
implemented eventually, but this should cover all v7 cases.

rdar://problem/13939186

llvm-svn: 182603
2013-05-23 19:11:20 +00:00
Tim Northover cedd48183f ARM: Add Performance Monitor Extensions feature
Performance monitors, including a basic cycle counter, are an official
extension in the ARMv7 specification. This adds support for enabling and
disabling them, orthogonally from CPU selection.

rdar://problem/13939186

llvm-svn: 182602
2013-05-23 19:11:14 +00:00
Tom Stellard 1b086cbcb8 R600: Fix R600ControlFlowFinalizer not considering VTX_READ 128 bit dst reg
Patch by: Vincent Lejeune

https://bugs.freedesktop.org/show_bug.cgi?id=64877

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 182600
2013-05-23 18:26:42 +00:00
Benjamin Kramer d78bb468bd Move passes from namespace llvm into anonymous namespaces. Sort includes while there.
llvm-svn: 182594
2013-05-23 17:10:37 +00:00
Benjamin Kramer ad5c24f161 More symbols that should be static.
llvm-svn: 182590
2013-05-23 16:09:15 +00:00
Benjamin Kramer e79beacb32 Hexagon: Make helper functions static.
llvm-svn: 182588
2013-05-23 15:43:11 +00:00
Benjamin Kramer 635e368e33 R600: Hide symbols of implementation details.
Also removes an unused function.

llvm-svn: 182587
2013-05-23 15:43:05 +00:00
Aaron Ballman 15f193a1a3 Setting the default value (fixes CRT assertions about uninitialized variable use when doing debug MSVC builds), and fixing coding style.
llvm-svn: 182585
2013-05-23 14:55:00 +00:00
Rafael Espindola 00345fa97b Fix 32 bit build in c++11 mode.
The error was:
error: non-constant-expression cannot be narrowed from type 'long long' to 'long' in initializer list [-Wc++11-narrowing]
        MI.getOperand(6).getImm() & 0x1F,

llvm-svn: 182584
2013-05-23 13:22:30 +00:00
Rafael Espindola 39aca620db Fix a leak on the r600 backend.
This should bring the valgrind bot back to life.

llvm-svn: 182561
2013-05-23 03:31:47 +00:00
Rafael Espindola bd6847fbea clang-format this file.
llvm-svn: 182560
2013-05-23 03:28:39 +00:00
Chad Rosier abdb1d69ab Simplify logic now that r182490 is in place. No functional change intended.
llvm-svn: 182531
2013-05-22 23:17:36 +00:00
Bill Schmidt f88571e027 Change some PowerPC PatLeaf definitions to ImmLeaf for fast-isel.
Using PatLeaf rather than ImmLeaf when defining immediate predicates
prevents simple patterns using those predicates from being recognized
for fast instruction selection.  This patch replaces the immSExt16
PatLeaf predicate with two ImmLeaf predicates, imm32SExt16 and
imm64SExt16, allowing a few more patterns to be recognized (ADDI,
ADDIC, MULLI, ADDI8, and ADDIC8).  Using the new predicates does not
help for LI, LI8, SUBFIC, and SUBFIC8 because these are rejected for
other reasons, but I see no reason to retain the PatLeaf predicate.

No functional change intended, and thus no test cases yet.  This is
preliminary work for enabling fast-isel support for PowerPC.  When
that support is ready, we'll be able to test this function.

llvm-svn: 182510
2013-05-22 20:09:24 +00:00
Nadav Rotem 7b66c47051 X86: Fix a bug in EltsFromConsecutiveLoads. We can't generate new loads without chains.
llvm-svn: 182507
2013-05-22 19:28:41 +00:00
Benjamin Kramer d76cc186fc X86: When expanding PCMPGTQ to PCMPGTD we always want to compare the lower halves as unsigned.
Take #2 on fixing PR15977.

llvm-svn: 182486
2013-05-22 17:01:12 +00:00
Rafael Espindola e3d83fb8c3 Fix use after free (pr16103).
llvm-svn: 182482
2013-05-22 15:31:11 +00:00
Rafael Espindola ebd8e38849 Check that a function starts with llvm. before using GET_FUNCTION_RECOGNIZER.
Fixes a use of uninitialized memory found by asan and valgind.

llvm-svn: 182480
2013-05-22 14:57:42 +00:00
Richard Sandiford 14a4449589 [SystemZ] Rename PSW to CC
Addresses a review comment from Ulrich Weigand.  No functional change intended.

I'm not sure whether the old TODO that this patch touches still holds,
but that's something we'd get to when adding a targetted scheduling
description.

llvm-svn: 182474
2013-05-22 13:38:45 +00:00
Richard Sandiford 03528f346a [SystemZ] Fix thinko in long branch pass
The original version of the pass could underestimate the length of a backward
branch in cases like:

    alignment to N bytes or more
    ...
    relaxable branch A
    ...
 foo: (aligned to M<N bytes)
    ...
 bar: (aligned to N bytes)
    ...
    relaxable branch B to foo

We don't add any misalignment gap for "bar" because N bytes of alignment
had already been reached earlier in the function.  In this case, assuming
that A is relaxed can push "foo" closer to "bar", and make B appear to be
in range.  Similar problems can occur for forward branches.

I don't think it's possible to create blocks with mixed alignments as
things stand, not least because we haven't yet defined getPrefLoopAlignment()
for SystemZ (that would need benchmarking).  So I don't think we can test
this yet.

Thanks to Rafael Espíndola for spotting the bug.

llvm-svn: 182460
2013-05-22 09:57:57 +00:00
David Majnemer 7ea2a52a0c X86: Remove test instructions proceeding shift by immediate instructions
Allow LLVM to take advantage of shift instructions that set the ZF flag,
making instructions that test the destination superfluous.

llvm-svn: 182454
2013-05-22 08:13:02 +00:00
NAKAMURA Takumi 4f328e1c2f R600ISelLowering.cpp: Avoid "using namespace Intrinsic;" to appease MSC. Specify namespaces explicitly here.
MSC is confused about "memcpy" between <cstring> and llvm::Intrinsic::memcpy, when llvm::Intrinsic were exposed.

llvm-svn: 182452
2013-05-22 06:37:31 +00:00
NAKAMURA Takumi 18ca09c1cc R600: Whitespace and untabify.
llvm-svn: 182451
2013-05-22 06:37:25 +00:00
Owen Anderson 616852848a Create an FPOW SDNode opcode def in the target independent .td file rather than in a specific backend.
llvm-svn: 182450
2013-05-22 06:36:09 +00:00
Rafael Espindola 21ea01d132 Attempt to fix the mingw32 bot.
This should hopefully fix
http://lab.llvm.org:8011/builders/clang-x86_64-darwin11-self-mingw32

llvm-svn: 182446
2013-05-22 02:30:47 +00:00
Rafael Espindola 525cf28652 s/u_int32_t/uint32_t/
llvm-svn: 182444
2013-05-22 01:36:19 +00:00
Rafael Espindola f568827654 Fix warning in non-assert build.
llvm-svn: 182443
2013-05-22 01:29:38 +00:00
Reed Kotler c6c7e4a67c Mips16 does not use register scavenger from TargetRegisterInfo. It allocates
a RegScavenger object on it's own.
 

llvm-svn: 182430
2013-05-21 22:06:02 +00:00
Akira Hatanaka be76cd0b8e [mips] Rename option to make it compatible with gcc.
llvm-svn: 182397
2013-05-21 17:17:59 +00:00
Akira Hatanaka 6871031be9 [mips] Add instruction selection patterns for blez and bgez.
llvm-svn: 182396
2013-05-21 17:13:47 +00:00
Justin Holewinski 48f4ad3fc0 [NVPTX] Add @llvm.nvvm.sqrt.f() intrinsic
llvm-svn: 182394
2013-05-21 16:51:30 +00:00
Jyotsna Verma 1b056e422c Hexagon: SelectionDAG should not use MVT::Other to check the legality of BR_CC.
llvm-svn: 182390
2013-05-21 15:54:32 +00:00
Hal Finkel c5211291f1 Fix PPC branch selection for counter-based branches
Although I had added some support for the BDZ/BDNZ branches into the selector
(in r158204), I had not correctly adjusted the condition at the top of the
loop. As a result, these branches were still essentially unsupported.

This fixes PR16086. Unfortunately, any test case would be very large (because
it would need to force the loop backedge to exceed the range of the 16-bit
immediate).

llvm-svn: 182385
2013-05-21 14:21:09 +00:00
Elena Demikhovsky 0dd4025ae9 removed commented lines
llvm-svn: 182377
2013-05-21 13:27:44 +00:00
Elena Demikhovsky fad029202f Removed SSEPacked domain from all forms (AVX, SSE, signed, unsigned) scalar compare instructions, like COMISS, COMISD.
No functional changes.

llvm-svn: 182371
2013-05-21 12:04:22 +00:00
Benjamin Kramer 18ef6b22b9 X86: When emulating unsigned PCMPGTQ with PCMPGTD, fix the sign bit for the smaller type.
Otherwise we'll get a mix of signed and unsigned compares.
Fixes PR15977.

llvm-svn: 182364
2013-05-21 09:58:54 +00:00
Richard Sandiford 3b105a063f Fix indentation
llvm-svn: 182356
2013-05-21 08:48:24 +00:00
Reed Kotler 0fed8d4ef7 Add some additional functions to the list of helper functions for
pic calls. These need to be there so we don't try and use helper
functions when we call those.

As part of this, make sure that we properly exclude helper functions in pic
mode when indirect calls are involved.

llvm-svn: 182343
2013-05-21 00:50:30 +00:00
Hal Finkel a969df84ab Rename LoopSimplify.h to LoopUtils.h
As discussed, LoopUtils.h is a better name.

llvm-svn: 182314
2013-05-20 20:46:30 +00:00
Akira Hatanaka 5de4416962 [mips] Add (setne $lhs, 0) instruction selection pattern.
llvm-svn: 182307
2013-05-20 18:18:07 +00:00
Akira Hatanaka 1cb024207f [mips] Trap on integer division by zero.
By default, a teq instruction is inserted after integer divide. No divide-by-zero
checks are performed if option "-mnocheck-zero-division" is used.

llvm-svn: 182306
2013-05-20 18:07:43 +00:00
Hal Finkel e6d7c285b3 Remove copied preheader insertion logic from PPCCTRLoops
Now that the preheader insertion logic in LoopSimplify is externally exposed,
use it, and remove the copy-and-pasted version.

No functionality change intended.

llvm-svn: 182300
2013-05-20 16:47:10 +00:00
Justin Holewinski 4c47d87ba6 [NVPTX] Fix mis-use of CurrentFnSym in NVPTXAsmPrinter. This was causing a symbol name error in the output PTX.
llvm-svn: 182298
2013-05-20 16:42:18 +00:00
Justin Holewinski 18f3a1ffe6 [NVPTX] Add programmatic interface to NVVMReflect pass
llvm-svn: 182297
2013-05-20 16:42:16 +00:00
Hal Finkel 0859ef29d5 Rename PPC MTCTRse to MTCTRloop
As the pairing of this instruction form with the bdnz/bdz branches is now
enforced by the verification pass, make it clear from the name that these
are used only for counter-based loops.

No functionality change intended.

llvm-svn: 182296
2013-05-20 16:08:37 +00:00
Hal Finkel 8ca3884147 Add a PPCCTRLoops verification pass
When asserts are enabled, this adds a verification pass for PPC counter-loop
formation. Unfortunately, without sacrificing code quality, there is no better
way of forming counter-based loops except at the (late) IR level. This means
that we need to recognize, at the IR level, anything which might turn into a
function call (or indirect branch). Because this is currently a finite set of
things, and because SelectionDAG lowering is basic-block local, this can be
done. Nevertheless, it is fragile, and failure results in a miscompile. This
verification pass checks that all (reachable) counter-based branches are
dominated by a loop mtctr instruction, and that no instructions in between
clobber the counter register. If these conditions are not satisfied, then an
ICE will be triggered.

In short, this is to help us sleep better at night.

llvm-svn: 182295
2013-05-20 16:08:17 +00:00
Benjamin Kramer 927ca942ce R600: Fix bug detected by GCC warning.
R600TextureIntrinsicsReplacer.cpp:232: warning: the address of ‘ArgsType’ will always evaluate as ‘true’

This doesn't have any effect on the output as a vararg intrinsic behaves the
same way as a non-vararg one.

llvm-svn: 182293
2013-05-20 15:58:43 +00:00
Tom Stellard f1ee716446 R600/SI: Use a multiclass for MUBUF_Load_Helper
This will simplify the instructions and also the pattern definitions.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 182288
2013-05-20 15:02:31 +00:00
Tom Stellard b8458f88d6 R600/SI: Add a pattern for S_LOAD_DWORDX2_* instructions
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 182287
2013-05-20 15:02:28 +00:00
Tom Stellard d2eebf001e R600/SI: Add pattern for rotr
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 182286
2013-05-20 15:02:24 +00:00