Commit Graph

108328 Commits

Author SHA1 Message Date
Chandler Carruth 02387122e0 [x86] Fix an oversight in the v8i32 path of the new vector shuffle
lowering where it only used the mask of the low 128-bit lane rather than
the entire mask.

This allows the new lowering to correctly match the unpack patterns for
v8i32 vectors.

For reference, the reason that we check for the the entire mask rather
than checking the repeated mask is because the repeated masks don't
abide by all of the invariants of normal masks. As a consequence, it is
safer to use the full mask with functions like the generic equivalence
test.

llvm-svn: 218442
2014-09-25 04:10:27 +00:00
Chandler Carruth 8140158cb5 [x86] Rearrange the code for v16i16 lowering a bit for clarity and to
reduce the amount of checking we do here.

The first realization is that only non-crossing cases between 128-bit
lanes are handled by almost the entire function. It makes more sense to
handle the crossing cases first.

THe second is that until we actually are going to generate fancy shared
lowering strategies that use the repeated semantics of the v8i16
lowering, we should waste time checking for repeated masks. It is
simplest to directly test for the entire unpck masks anyways, so we
gained nothing from this.

This also matches the structure of v32i8 more closely.

No functionality changed here.

llvm-svn: 218441
2014-09-25 04:03:22 +00:00
Chandler Carruth d8f528adb8 [x86] Implement AVX2 support for v32i8 in the new vector shuffle
lowering.

This completes the basic AVX2 feature support, but there are still some
improvements I'd like to do to really get the last mile of performance
here.

llvm-svn: 218440
2014-09-25 02:52:12 +00:00
Chandler Carruth 397d12c4b4 [x86] More tweaks to the v32i8 test cases.
I made a mistake in the previous commit and produced the wrong pattern.
Fix that. Also make one more shuffle pattern byte-based rather than
word-based, and add two more blend patterns.

llvm-svn: 218439
2014-09-25 02:44:39 +00:00
Chandler Carruth a03011ffae [x86] Re-work a bunch of the v32i8 test cases to actually involve byte
shuffles rather than word shuffles.

As you might guess, these were built starting from the word shuffle test
cases and I failed to properly port a bunch of them and left them as
widened word shuffle test cases. We still have a couple of tests that
check our ability to widen shuffles, but now we will test the actual
byte shuffle quite a bit better.

llvm-svn: 218438
2014-09-25 02:20:02 +00:00
Reid Kleckner 81782f0cb8 MC: Use @IMGREL instead of @IMGREL32, which we can't parse
Nico Rieck added support for this 32-bit COFF relocation some time ago
for Win64 stuff. It appears that as an oversight, the assembly output
used "foo"@IMGREL32 instead of "foo"@IMGREL, which is what we can parse.

Sadly, there were actually tests that took in IMGREL and put out
IMGREL32, and we didn't notice the inconsistency. Oh well. Now LLVM can
assemble it's own output with slightly more fidelity.

llvm-svn: 218437
2014-09-25 02:09:18 +00:00
Chandler Carruth d355369dbb [x86] Remove the defunct X86ISD::BLENDV entry -- we use vector selects
for this now.

Should prevent folks from running afoul of this and not knowing why
their code won't instruction select the way I just did...

llvm-svn: 218436
2014-09-25 01:16:01 +00:00
Chandler Carruth a577bc26b6 [x86] Fix the v16i16 blend logic I added in the prior commit and add the
missing test cases for it.

Unsurprisingly, without test cases, there were bugs here. Surprisingly,
this bug wasn't caught at compile time. Yep, there is an X86ISD::BLENDV.
It isn't wired to anything. Oops. I'll fix than next.

llvm-svn: 218434
2014-09-25 01:13:38 +00:00
Justin Bogner b35a72ae9e llvm-cov: Combine segments that cover the same location
If we have multiple coverage counts for the same segment, we need to
add them up rather than arbitrarily choosing one. This fixes that and
adds a test with template instantiations to exercise it.

llvm-svn: 218432
2014-09-25 00:34:18 +00:00
Akira Hatanaka 8cc48bd159 [X86,AVX] Add an isel pattern for X86VBroadcast.
This fixes PR21050 and rdar://problem/18434607.

llvm-svn: 218431
2014-09-25 00:26:15 +00:00
Chandler Carruth 98443d89b9 [x86] Implement v16i16 support with AVX2 in the new vector shuffle
lowering.

This also implements the fancy blend lowering for v16i16 using AVX2 and
teaches the X86 backend to print shuffle masks for 256-bit PSHUFB
and PBLENDW instructions. It also makes the mask decoding correct for
PBLENDW instructions. The yaks, they are legion.

Tests are updated accordingly. There are some missing tests for the
VBLENDVB lowering, but I'll add those in a follow-up as this commit has
accumulated enough cruft already.

llvm-svn: 218430
2014-09-25 00:24:19 +00:00
Kevin Enderby bf246f5a9d Flush out enough of llvm-objdump’s SymbolizerSymbolLookUp() for Mach-O files to
get the literal string “Hello world” printed as a comment on the instruction
that loads the pointer to it. For now this is just for x86_64. So for object
files with relocation entries it produces things like:

	leaq	L_.str(%rip), %rax      ## literal pool for: "Hello world\n"

and similar for fully linked images like executables:

	leaq	0x4f(%rip), %rax        ## literal pool for: "Hello world\n"

Also to allow testing against darwin’s otool(1), I hooked up the existing 
-no-show-raw-insn option to the Mach-O parser code, added the new Mach-O
only -full-leading-addr option to match otool(1)'s printing of addresses and
also added the new -print-imm-hex option.

llvm-svn: 218423
2014-09-24 23:08:22 +00:00
Kostya Serebryany 34ddf8725c [asan] don't instrument module CTORs that may be run before asan.module_ctor. This fixes asan running together -coverage
llvm-svn: 218421
2014-09-24 22:41:55 +00:00
Renato Golin 9c4a6d87ec Removing empty ARM tests from failed revert
llvm-svn: 218419
2014-09-24 21:58:04 +00:00
Renato Golin a86bbc37f2 Removing empty tests from failed revert
llvm-svn: 218417
2014-09-24 21:45:26 +00:00
Renato Golin 4b5f91f513 Revert 218406 - Refactor the RelocVisitor::visit method
llvm-svn: 218416
2014-09-24 21:30:43 +00:00
Renato Golin ba89f068bf Revert 218407 - Add support for ARM and AArch64 BE object files
llvm-svn: 218415
2014-09-24 21:30:14 +00:00
Renato Golin d35e6f6aee Revert 218408 - Report endianness in output of {dwarf, obj}dump
llvm-svn: 218414
2014-09-24 21:29:45 +00:00
Renato Golin 2328747ede Revert 218411 - XFAIL reloc test on x86/hexagon
llvm-svn: 218413
2014-09-24 21:28:53 +00:00
Renato Golin 7aa836043f XFAIL reloc test on x86/hexagon
llvm-svn: 218411
2014-09-24 21:00:30 +00:00
Akira Hatanaka 8e77dbbf5a Revert r218380. This was breaking Apple internal build bots.
llvm-svn: 218409
2014-09-24 20:37:14 +00:00
Renato Golin 6f92c6b982 Report endianness in output of {dwarf, obj}dump
For biendian targets like ARM and AArch64, it is useful to have the
output of the llvm-dwarfdump and llvm-objdump report the endianness
used when the object files were generated.

Patch by Charlie Turner.

llvm-svn: 218408
2014-09-24 20:07:41 +00:00
Renato Golin ed654f5852 Add support for ARM and AArch64 BE object files
This change fixes the ARM and AArch64 relocation visitors in
RelocVisitor.  They were unconditionally assuming the object data are
little-endian.  Tests have been added to ensure that the
llvm-dwarfdump utility does not crash when processing big-endian
object files.

Patch by Charlie Turner.

llvm-svn: 218407
2014-09-24 20:07:30 +00:00
Renato Golin 2b25450061 Refactor the RelocVisitor::visit method
This change replaces the brittle if/else chain of string comparisons
with a switch statement on the detected target triple, removing the
need for testing arbitrary architecture names returned from
getFileFormatName, whose primary purpose seems to be for display
(user-interface) purposes. The visitor now takes a reference to the
object file, rather than its arbitrary file format name to figure out
whether the file is a 32 or 64-bit object file and what the detected
target triple is.

A set of tests have been added to help show that the refactoring processes
relocations for the same targets as the original code.

Patch by Charlie Turner.

llvm-svn: 218406
2014-09-24 20:07:22 +00:00
Scott Douglass ae671341c4 pass environment when invoking llvm-config from lit.cfg
Use the same environment when invoking llvm-config from lit.cfg as
will be used when running tests, so that ASAN_OPTIONS, INCLUDE, etc.
are present.

llvm-svn: 218403
2014-09-24 18:37:48 +00:00
Chris Bieneman 7827217131 Adding #ifdef around TermColorMutex based on feedback from Craig Topper.
llvm-svn: 218401
2014-09-24 18:35:58 +00:00
Chandler Carruth edcba62b4a [x86] Factor out the logic to generically decombose a vector shuffle
into unblended shuffles and a blend.

This is the consistent fallback for the lowering paths that have fast
blend operations available, and its getting quite repetitive.

No functionality changed.

llvm-svn: 218399
2014-09-24 18:20:09 +00:00
Kaelyn Takata c4067328cf Revert "Add support for ARM and AArch64 BE object files"
This reverts commit r218389 as it depends on r218388.

llvm-svn: 218398
2014-09-24 18:00:20 +00:00
Kaelyn Takata e43d88e3f5 Revert "Report endianness in output of {dwarf, obj}dump"
This reverts commit r218391 as it depends on r218388 and r218389

llvm-svn: 218397
2014-09-24 18:00:17 +00:00
Kaelyn Takata f2fce14920 Revert "Refactor the RelocVisitor::visit method"
This reverts commit faac033f7364bb4226e22c8079c221c96af10d02.

The test depends on all targets to be enabled in llc in order to pass,
and needs to be rewritten/refactored to not have that dependency.

llvm-svn: 218393
2014-09-24 17:49:07 +00:00
Renato Golin 4edda28b8a Report endianness in output of {dwarf, obj}dump
For biendian targets like ARM and AArch64, it is useful to have the
output of the llvm-dwarfdump and llvm-objdump report the endianness
used when the object files were generated.

Patch by Charlie Turner.

llvm-svn: 218391
2014-09-24 17:01:33 +00:00
Renato Golin 0e92815e94 Add support for ARM and AArch64 BE object files
This change fixes the ARM and AArch64 relocation visitors in
RelocVisitor.  They were unconditionally assuming the object data are
little-endian.  Tests have been added to ensure that the
llvm-dwarfdump utility does not crash when processing big-endian
object files.

Patch by Charlie Turner.

llvm-svn: 218389
2014-09-24 17:01:06 +00:00
Renato Golin 53f6034f8e Refactor the RelocVisitor::visit method
This change replaces the brittle if/else chain of string comparisons
with a switch statement on the detected target triple, removing the
need for testing arbitrary architecture names returned from
getFileFormatName, whose primary purpose seems to be for display
(user-interface) purposes. The visitor now takes a reference to the
object file, rather than its arbitrary file format name to figure out
whether the file is a 32 or 64-bit object file and what the detected
target triple is.

A set of tests have been added to help show that the refactoring processes
relocations for the same targets as the original code.

Patch by Charlie Turner.

llvm-svn: 218388
2014-09-24 17:00:42 +00:00
David Peixotto 0d4d5e64ec Fix assertion in LICM doFinalization()
The doFinalization method checks that the LoopToAliasSetMap is
empty. LICM populates that map as it runs through the loop nest,
deleting the entries for child loops as it goes. However, if a child
loop is deleted by another pass (e.g. unrolling) then the loop will
never be deleted from the map because LICM walks the loop nest to
find entries it can delete.

The fix is to delete the loop from the map and free the alias set
when the loop is deleted from the loop nest.

Differential Revision: http://reviews.llvm.org/D5305

llvm-svn: 218387
2014-09-24 16:48:31 +00:00
Moritz Roth f5d0c7c2c0 [Thumb] Make load/store optimizer less conservative.
If it's safe to clobber the condition flags, we can do a few extra things:
it's then possible to reset the base register writeback using a SUBS, so
we can try to merge even if the base register isn't dead after the merged
instruction.

This is effectively a (heavily bug-fixed) rewrite of r208992.

llvm-svn: 218386
2014-09-24 16:35:50 +00:00
Oliver Stannard 1ae8b476f4 [Thumb] 32-bit encodings of 'cps' are not valid for v7M
v7M only allows the 16-bit encoding of the 'cps' (Change Processor
State) instruction, and does not have the 32-bit encoding which is
valid from v6T2 onwards.

llvm-svn: 218382
2014-09-24 14:20:01 +00:00
Aaron Ballman f086a14d53 Silencing an "enumeral and non-enumeral type in conditional expression" warning. NFC.
llvm-svn: 218381
2014-09-24 13:54:56 +00:00
Benjamin Kramer ce246a13ea Replace a hand-written suffix compare with std::lexicographical_compare.
No functionality change.

llvm-svn: 218380
2014-09-24 13:19:28 +00:00
Chandler Carruth e7e9c04ddf [x86] Teach the instruction lowering to add comments describing constant
pool data being loaded into a vector register.

The comments take the form of:

  # ymm0 = [a,b,c,d,...]
  # xmm1 = <x,y,z...>

The []s are used for generic sequential data and the <>s are used for
specifically ConstantVector loads. Undef elements are printed as the
letter 'u', integers in decimal, and floating point values as floating
point values. Suggestions on improving the formatting or other aspects
of the display are very welcome.

My primary use case for this is to be able to FileCheck test masks
passed to vector shuffle instructions in-register. It isn't fantastic
for that (no decoding special zeroing semantics or other tricks), but it
at least puts the mask onto an instruction line that could reasonably be
checked. I've updated many of the new vector shuffle lowering tests to
leverage this in their test cases so that we're actually checking the
shuffle masks remain as expected.

Before implementing this, I tried a *bunch* of different approaches.
I looked into teaching the MCInstLower code to scan up the basic block
and find a definition of a register used in a shuffle instruction and
then decode that, but this seems incredibly brittle and complex.
I talked to Hal a lot about the "right" way to do this: attach the raw
shuffle mask to the instruction itself in some form of unencoded
operands, and then use that to emit the comments. I still think that's
the optimal solution here, but it proved to be beyond what I'm up for
here. In particular, it seems likely best done by completing the
plumbing of metadata through these layers and attaching the shuffle mask
in metadata which could have fully automatic dropping when encoding an
actual instruction.

llvm-svn: 218377
2014-09-24 09:39:41 +00:00
Michael Liao d120916ca7 Allow BB duplication threshold to be adjusted through JumpThreading's ctor
- BB duplication may not be desired on targets where there is no or small
  branch penalty and code duplication needs restrict control.

llvm-svn: 218375
2014-09-24 04:59:06 +00:00
NAKAMURA Takumi f744ad43e1 Windows/Host.inc: Reformat the header to fit 80-col.
llvm-svn: 218374
2014-09-24 04:45:14 +00:00
NAKAMURA Takumi 239a226dea Unix/Host.inc: Remove <cstdlib>. It has been unused for a long time.
llvm-svn: 218373
2014-09-24 04:45:02 +00:00
NAKAMURA Takumi 12abbdaeab Unix/Host.inc: Wrap a comment line in 80-col.
llvm-svn: 218371
2014-09-24 04:44:50 +00:00
NAKAMURA Takumi 3d238b47ec Unix/Host.inc: Remove leading whitespace. It had been here since r56942!
llvm-svn: 218370
2014-09-24 04:44:37 +00:00
NAKAMURA Takumi d4252f925e valgrind/x86_64-pc-linux-gnu.supp: Suppress also /bin/bash.
llvm-svn: 218369
2014-09-24 04:38:20 +00:00
NAKAMURA Takumi 853a1bf82c valgrind/x86_64-pc-linux-gnu.supp: Tweak /bin/sed to let calloc recognized.
llvm-svn: 218368
2014-09-24 04:38:09 +00:00
Jiangning Liu 3b096172cf Clear PreferredExtendType for in each function-specific state FunctionLoweringInfo.
llvm-svn: 218364
2014-09-24 03:22:56 +00:00
Chandler Carruth 7b688c6884 [x86] More refactoring of the shuffle comment emission. The previous
attempt didn't work out so well. It looks like it will be much better
for introducing extra logic to find a shuffle mask if the finding logic
is totally separate. This also makes it easy to sink the opcode logic
completely out of the routine so we don't re-dispatch across it.

Still no functionality changed.

llvm-svn: 218363
2014-09-24 03:06:37 +00:00
Chandler Carruth edf50212df [x86] Bypass the shuffle mask comment generation when not using verbose
asm. This can be somewhat expensive and there is no reason to do it
outside of tests or debugging sessions. I'm also likely to make it
significantly more expensive to support more styles of shuffles.

llvm-svn: 218362
2014-09-24 03:06:34 +00:00
Chandler Carruth ab8b37a9d2 [x86] Hoist the logic for extracting the relevant bits of information
from the MachineInstr into the caller which is already doing a switch
over the instruction.

This will make it more clear how to compute different operands to feed
the comment selection for example.

Also, in a drive-by-fix, don't append an empty comment string (which is
a no-op ultimately).

No functionality changed.

llvm-svn: 218361
2014-09-24 02:24:41 +00:00
Matt Arsenault 2c41987490 R600/SI: Add new helper isSGPRClassID
Move these into header since they are trivial

llvm-svn: 218360
2014-09-24 02:17:12 +00:00
Matt Arsenault 262407bc2f R600/SI: Fix hardcoded and wrong operand numbers.
Also fix leftover debug printing

llvm-svn: 218359
2014-09-24 02:17:09 +00:00
Matt Arsenault 69612d6027 R600/SI: Enable named operand table for SALU instructions
llvm-svn: 218358
2014-09-24 02:17:06 +00:00
Chandler Carruth 0b682d42de [x86] Start refactoring the comment printing logic in the MC lowering of
vector shuffles.

This is just the beginning by hoisting it into its own function and
making use of early exit to dramatically simplify the flow of the
function. I'm going to be incrementally refactoring this until it is
a bit less magical how this applies to other instructions, and I can
teach it how to dig a shuffle mask out of a register. Then I plan to
hook it up to VPERMD so we get our mask comments for it.

No functionality changed yet.

llvm-svn: 218357
2014-09-24 02:16:12 +00:00
Matt Arsenault 3e0effa223 R600/SI: Fix weird CHECK-DAG usage
This prevents these from failing in a future commit.

llvm-svn: 218356
2014-09-24 02:14:26 +00:00
Tom Stellard 744b99b476 R600/SI: Enable selecting SALU inside branches
We can do this now that the FixSGPRLiveRanges pass is working.

llvm-svn: 218353
2014-09-24 01:33:28 +00:00
Tom Stellard deb3f9e643 R600/SI: Move PHIs that define SGPRs to the VALU in most cases
This fixes a bug that is uncovered by a future commit and will
be tested by the test/CodeGen/R600/sgpr-control-flow.ll test case.

llvm-svn: 218352
2014-09-24 01:33:26 +00:00
Tom Stellard 60024a0558 R600/SI: Fix the FixSGPRLiveRanges pass
The previous implementation was extending the live range of SGPRs
by modifying the live intervals directly.  This was causing a lot
of machine verification errors when the machine scheduler was enabled.

The new implementation adds pseudo instructions with implicit uses to
extend the live ranges of SGPRs, which works much better.

llvm-svn: 218351
2014-09-24 01:33:24 +00:00
Tom Stellard be507fb5d3 R600/SI: Mark EXEC_LO and EXEC_HI as reserved
These registers can be allocated and used like other 32-bit registers,
but it seems like a likely source for bugs.

llvm-svn: 218350
2014-09-24 01:33:23 +00:00
Tom Stellard 9a88593ed0 R600/SI: Fix SIRegisterInfo::getPhysRegSubReg()
Correctly handle special registers: EXEC, EXEC_LO, EXEC_HI, VCC_LO,
VCC_HI, and M0.  The previous implementation would assertion fail
when passed these registers.

llvm-svn: 218349
2014-09-24 01:33:22 +00:00
Tom Stellard 96468903d4 R600/SI: Implement VGPR register spilling for compute at -O0 v3
VGPRs are spilled to LDS.  This still needs more testing, but
we need to at least enable it at -O0, because the fast register
allocator spills all registers that are live at the end of blocks
and without this some future commits will break the
flat-address-space.ll test.

v2: Only calculate thread id once

v3: Move insertion of spill instructions to
    SIRegisterInfo::eliminateFrameIndex()
llvm-svn: 218348
2014-09-24 01:33:17 +00:00
Chandler Carruth 9bd10e7492 [x86] Teach the new vector shuffle lowering to lower v8i32 shuffles with
the native AVX2 instructions.

Note that the test case is really frustrating here because VPERMD
requires the mask to be in the register input and we don't produce
a comment looking through that to the constant pool. I'm going to
attempt to improve this in a subsequent commit, but not sure if I will
succeed.

llvm-svn: 218347
2014-09-24 01:24:44 +00:00
Chandler Carruth fd11815a7d [x86] Fix a really terrible bug in the repeated 128-bin-lane shuffle
detection. It was incorrectly handling undef lanes by actually treating
an undef lane in the first 128-bit lane as a *numeric* shuffle value.

Fortunately, this almost always DTRT and disabled detecting repeated
patterns. But not always. =/ This patch introduces a much more
principled approach and fixes the miscompiles I spotted by inspection
previously.

llvm-svn: 218346
2014-09-24 01:03:57 +00:00
Robin Morisset dc1b248ccf Fix swift-atomics testcase
This testcase was not testing what it meant: because there were only two checks for
dmb {{ish}} in the second function, it could have missed a bug where one of the three
required dmb {{ish}} became dmb {{ishst}}. As I was fixing it, I also added
CHECK-LABELs to make it a bit less brittle.

llvm-svn: 218341
2014-09-23 23:18:01 +00:00
Chandler Carruth df2e421845 [x86] Teach the new vector shuffle lowering to lower v4i64 vector
shuffles using the AVX2 instructions. This is the first step of cutting
in real AVX2 support.

Note that I have spotted at least one bug in the test cases already, but
I suspect it was already present and just is getting surfaced. Will
investigate next.

llvm-svn: 218338
2014-09-23 22:39:02 +00:00
Reid Kleckner 78927e884b GlobalOpt: Preserve comdats of unoptimized initializers
Rather than slurping in and splatting out the whole ctor list, preserve
the existing array entries without trying to understand them.  Only
remove the entries that we know we can optimize away.  This way we don't
need to wire through priority and comdats or anything else we might add.

Fixes a linker issue where the .init_array or .ctors entry would point
to discarded initialization code if the comdat group from the TU with
the faulty global_ctors entry was dropped.

llvm-svn: 218337
2014-09-23 22:33:01 +00:00
Jim Grosbach 57fd2623c3 AArch64: allow constant expressions for shifted reg literals
e.g., add w1, w2, w3, lsl #(2 - 1)

This sort of thing comes up in pre-processed assembly playing macro games.
Still validate that it's an assembly time constant. The early exit error check
was just a bit overzealous and disallowed a left paren.

rdar://18430542

llvm-svn: 218336
2014-09-23 22:16:02 +00:00
Chandler Carruth 9a94bd6fa4 [x86] Teach the rest of the 'target shuffle' machinery about blends and
add VPBLENDD to the InstPrinter's comment generation so we get nice
comments everywhere.

Now that we have the nice comments, I can see the bug introduced by
a silly typo in the commit that enabled VPBLENDD, and have fixed it. Yay
tests that are easy to inspect.

llvm-svn: 218335
2014-09-23 22:14:14 +00:00
Tom Stellard 73ae1cb59a R600/SI: Clean up checks for legality of immediate operands
There are new register classes VCSrc_* which represent operands that
can take an SGPR, VGPR or inline constant.  The VSrc_* class is now used
to represent operands that can take an SGPR, VGPR, or a 32-bit
immediate.

This allows us to have more accurate checks for legality of
immediates, since before we had no way to distinguish between operands
that supported any 32-bit immediate and operands which could only
support inline constants.

llvm-svn: 218334
2014-09-23 21:26:25 +00:00
Robin Morisset 6dbbbc28b0 [X86] Make wide loads be managed by AtomicExpand
Summary:
AtomicExpand already had logic for expanding wide loads and stores on LL/SC
architectures, and for expanding wide stores on CmpXchg architectures, but
not for wide loads on CmpXchg architectures. This patch fills this hole,
and makes use of this new feature in the X86 backend.

Only one functionnal change: we now lose the SynchScope attribute.
It is regrettable, but I have another patch that I will submit soon that will
solve this for all of AtomicExpand (it seemed better to split it apart as it
is a different concern).

Test Plan: make check-all (lots of tests for this functionality already exist)

Reviewers: jfb

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5404

llvm-svn: 218332
2014-09-23 20:59:25 +00:00
Robin Morisset 2212996936 [Power] Use AtomicExpandPass for fence insertion, and use lwsync where appropriate
Summary:
This patch makes use of AtomicExpandPass in Power for inserting fences around
atomic as part of an effort to remove fence insertion from SelectionDAGBuilder.
As a big bonus, it lets us use sync 1 (lightweight sync, often used by the mnemonic
lwsync) instead of sync 0 (heavyweight sync) in many cases.

I also added a test, as there was no test for the barriers emitted by the Power
backend for atomic loads and stores.

Test Plan: new test + make check-all

Reviewers: jfb

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5180

llvm-svn: 218331
2014-09-23 20:46:49 +00:00
Robin Morisset dedef3325f Add AtomicExpandPass::bracketInstWithFences, and use it whenever getInsertFencesForAtomic would trigger in SelectionDAGBuilder
Summary:
The goal is to eventually remove all the code related to getInsertFencesForAtomic
in SelectionDAGBuilder as it is wrong (designed for ARM, not really portable, works
mostly by accident because the backends are overly conservative), and repeats the
same logic that goes in emitLeading/TrailingFence.

In this patch, I make AtomicExpandPass insert the fences as it knows better
where to put them. Because this requires getting the fences and not just
passing an IRBuilder around, I had to change the return type of
emitLeading/TrailingFence.
This code only triggers on ARM for now. Because it is earlier in the pipeline
than SelectionDAGBuilder, it triggers and lowers atomic accesses to atomic so
SelectionDAGBuilder does not add barriers anymore on ARM.

If this patch is accepted I plan to implement emitLeading/TrailingFence for all
backends that setInsertFencesForAtomic(true), which will allow both making them
less conservative and simplifying SelectionDAGBuilder once they are all using
this interface.

This should not cause any functionnal change so the existing tests are used
and not modified.

Test Plan: make check-all, benefits from existing tests of atomics on ARM

Reviewers: jfb, t.p.northover

Subscribers: aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D5179

llvm-svn: 218329
2014-09-23 20:31:14 +00:00
Lang Hames da01602647 [MCJIT] Fix some more RuntimeDyld debugging output format specifiers.
llvm-svn: 218328
2014-09-23 19:20:57 +00:00
Lang Hames a633a6cdb1 [MCJIT] Remove PPCRelocations.h - it's no longer used.
This was overlooked in r218320, which removed the relocation headers for other
targets. Thanks to Ulrich Weigand for catching it.

llvm-svn: 218327
2014-09-23 19:17:48 +00:00
Robin Morisset a7b357fed1 Just add a fixme about a possibly faster implementation of some atomic loads on some ARM processors
llvm-svn: 218326
2014-09-23 18:33:21 +00:00
Matt Arsenault 4364fef82f Fix typo
llvm-svn: 218324
2014-09-23 18:30:57 +00:00
Chandler Carruth adcfec995c [x86] Teach the new shuffle lowering's blend functionality to use AVX2's
VPBLENDD where appropriate even on 128-bit vectors.

According to Agner's tables, this instruction is significantly higher
throughput (can execute on any port) on Haswell chips so we should
aggressively try to form it when available.

Sadly, this loses our delightful shuffle comments. I'll add those back
for VPBLENDD next.

llvm-svn: 218322
2014-09-23 18:16:12 +00:00
Lang Hames d5f496d57c [MCJIT] Nuke MachineRelocation and MachineCodeEmitter. Now that the old JIT is
gone they're no longer needed.

llvm-svn: 218320
2014-09-23 18:08:47 +00:00
Jingyue Wu c4725da382 [docs] Fixed a typo in Atomics.rst
llvm-svn: 218319
2014-09-23 17:35:28 +00:00
Lang Hames 051742431a [MCJIT] Remove a few more references to JITMemoryManager that survived r218316.
llvm-svn: 218318
2014-09-23 17:10:24 +00:00
Lang Hames 7f19f2281a [MCJIT] Remove #include of JITMemoryManager that accidentally survived r218316.
llvm-svn: 218317
2014-09-23 17:02:24 +00:00
Lang Hames 0f15490bcd [MCJIT] Delete the JTIMemoryManager and associated APIs.
This patch removes the old JIT memory manager (which does not provide any
useful functionality now that the old JIT is gone), and migrates the few
remaining clients over to SectionMemoryManager.

http://llvm.org/PR20848

llvm-svn: 218316
2014-09-23 16:56:02 +00:00
Sanjay Patel 6a42292795 Use SDValue bool operator to reduce code. No functional change.
llvm-svn: 218314
2014-09-23 16:24:20 +00:00
Oliver Stannard c546625c4f Fix segfault in AArch64 backend with -g and -mbig-endian
Fix a null pointer dereference when trying to swap the endianness of
fixups in the .eh_frame section in the AArch64 backend.

llvm-svn: 218311
2014-09-23 15:38:11 +00:00
NAKAMURA Takumi dbf2c21e84 Rework r218304, "ExecutionEngineTests: Call llvm_shutdown() on exit for ManagedStatic introduced in r218151."
r218304 caused crash on msvc builder.

llvm-svn: 218308
2014-09-23 14:41:02 +00:00
NAKAMURA Takumi d103026c78 valgrind/x86_64-pc-linux-gnu.supp: We don't care if sed leaks.
llvm-svn: 218307
2014-09-23 14:19:09 +00:00
Timur Iskhodzhanov f6b889126c Fix a small typo in the test comment
llvm-svn: 218306
2014-09-23 14:07:12 +00:00
Sid Manning bd8bd484c3 Loop instead of individual def's for each GPR.
Differential Revision: http://reviews.llvm.org/D5450

llvm-svn: 218305
2014-09-23 13:55:50 +00:00
NAKAMURA Takumi 4d723baafc ExecutionEngineTests: Call llvm_shutdown() on exit for ManagedStatic introduced in r218151.
llvm-svn: 218304
2014-09-23 13:49:51 +00:00
Timur Iskhodzhanov d171153f81 Rebuild the inputs for the codeview-linetables.test with VS2013
Also provide reproducible instructions

llvm-svn: 218303
2014-09-23 13:49:51 +00:00
Petar Jovanovic 7480e4db5e Do not destroy external linkage when deleting function body
The function deleteBody() converts the linkage to external and thus destroys
original linkage type value. Lack of correct linkage type causes wrong
relocations to be emitted later.
Calling dropAllReferences() instead of deleteBody() will fix the issue.

Differential Revision: http://reviews.llvm.org/D5415

llvm-svn: 218302
2014-09-23 12:54:19 +00:00
Chandler Carruth 40592d2dec [x86] Teach the vector comment parsing and printing to correctly handle
undef in the shuffle mask. This shows up when we're printing comments
during lowering and we still have an IR-level constant hanging around
that models undef.

A nice consequence of this is *much* prettier test cases where the undef
lanes actually show up as undef rather than as a particular set of
values. This also allows us to print shuffle comments in cases that use
undef such as the recently added variable VPERMILPS lowering. Now those
test cases have nice shuffle comments attached with their details.

The shuffle lowering for PSHUFB has been augmented to use undef, and the
shuffle combining has been augmented to comprehend it.

llvm-svn: 218301
2014-09-23 11:15:19 +00:00
Chandler Carruth 6d5916a2d7 [x86] Teach the AVX1 path of the new vector shuffle lowering one more
trick that I missed.

VPERMILPS has a non-immediate memory operand mode that allows it to do
asymetric shuffles in the two 128-bit lanes. Use this rather than two
shuffles and a blend.

However, it turns out the variable shuffle path to VPERMILPS (and
VPERMILPD, although that one offers no functional differenc from the
immediate operand other than variability) wasn't even plumbed through
codegen. Do such plumbing so that we can reasonably emit
a variable-masked VPERMILP instruction. Also plumb basic comment parsing
and printing through so that the tests are reasonable.

There are still a few tests which don't show the shuffle pattern. These
are tests with undef lanes. I'll teach the shuffle decoding and printing
to handle undef mask entries in a follow-up. I've looked at the masks
and they seem reasonable.

llvm-svn: 218300
2014-09-23 10:08:29 +00:00
Michael Kuperstein 946b3b2e16 Ensure bitcode encoding stays stable.
This includes constants, attributes, and some additional instructions not covered by previous tests.

Work was done by lama.saba@intel.com.

llvm-svn: 218297
2014-09-23 08:48:01 +00:00
Argyrios Kyrtzidis a170697b18 [ADT/IntrusiveRefCntPtr] Give friend access to IntrusiveRefCntPtr<X> so the relevant move constructor can access 'Obj'.
llvm-svn: 218295
2014-09-23 06:06:43 +00:00
NAKAMURA Takumi bbae11bd2d Windows/DynamicLibrary.inc: Remove 'extern "C"' in ELM_Callback.
'extern "C" static' is not accepted by g++-4.7. Rather to tweak, I just removed 'extern "C"', since it doesn't affect the ABI.

llvm-svn: 218290
2014-09-23 01:09:46 +00:00
Sanjay Patel 4bc685c206 tighten up checks
We manage to generate all of the matching instructions (and a lot more) via
the reciprocal optimization function - even if we completely remove the square
root optimization. With CHECK_NEXT, we assure that we're executing the
expected square root optimization paths and not generating extra insts.

llvm-svn: 218284
2014-09-22 22:46:44 +00:00
Chris Bieneman fa35e11a7b Converting terminalHasColors mutex to a global ManagedStatic to avoid the static destructor.
llvm-svn: 218283
2014-09-22 22:39:20 +00:00
Chandler Carruth ed5dfff865 [x86] Rename X86ISD::VPERMILP to X86ISD::VPERMILPI (and the same for the
td pattern). Currently we only model the immediate operand variation of
VPERMILPS and VPERMILPD, we should make that clear in the pseudos used.
Will be adding support for the variable mask variant in my next commit.

llvm-svn: 218282
2014-09-22 22:29:42 +00:00
Kaelyn Takata cecdff6512 Fix a "typo" from my previous commit.
llvm-svn: 218281
2014-09-22 22:17:59 +00:00
Kaelyn Takata ba0a1e0520 Silence unused variable warnings in the new stub functions that occur
when assertions are disabled.

llvm-svn: 218280
2014-09-22 22:14:13 +00:00
Sanjay Patel 5cf7561d21 remove unnecessary labels; NFC
llvm-svn: 218278
2014-09-22 21:52:53 +00:00
Chandler Carruth 252debeb0b [x86] Stub out the integer lowering of 256-bit vectors with AVX2
support. No interesting functionality yet, but this will let me
implement one vector type at a time.

llvm-svn: 218277
2014-09-22 21:45:57 +00:00
Yaron Keren fb06908989 In this callback ModuleName includes the file path.
Comparing ModuleName to the file names listed will 
always fail. 

I wonder how this code ever worked and what its 
purpose was. Why exclude the msvc runtime DLLs
but not exclude all Windows system DLLs?

Anyhow, it does not function as intended.

clang-formatted as well.

llvm-svn: 218276
2014-09-22 21:40:15 +00:00
Juergen Ributzka 27e959d7b2 [FastISel][AArch64] Also allow folding of sign-/zero-extend and shift-left for booleans (i1).
Shift-left immediate with sign-/zero-extensions also works for boolean values.
Update the assert and the test cases to reflect that fact.

This should fix a bug found by Chad.

llvm-svn: 218275
2014-09-22 21:08:53 +00:00
Ehsan Akhgari bb6bb07d18 ms-inline-asm: Fix parsing label names inside bracket expressions
Summary:
This fixes a couple of issues.  One is ensuring that AOK_Label rewrite
rules have a lower priority than AOK_Skip rules, as AOK_Skip needs to
be able to skip the brackets properly.  The other part of the fix ensures
that we don't overwrite Identifier when looking up the identifier, and
that we use the locally available information to generate the AOK_Label
rewrite in ParseIntelIdentifier.  Doing that in CreateMemForInlineAsm
would be problematic since the Start location there may point to the
beginning of a bracket expression, and not necessarily the beginning of
an identifier.

This also means that we don't need to carry around the InternlName field,
which helps simplify the code.

Test Plan: This will be tested on the clang side.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5445

llvm-svn: 218270
2014-09-22 20:40:36 +00:00
David Majnemer 597be2ded6 MC: ReadOnlyWithRel section kinds should map to rdata in COFF
Don't consider ReadOnlyWithRel as a writable section in COFF, they
really belong in .rdata.

llvm-svn: 218268
2014-09-22 20:39:23 +00:00
Chandler Carruth 44deb8015c [x86] Introduce tests covering the gamut of 256-bit vector shuffling.
These are just test cases, no actual code yet. This establishes the
baseline fallback strategy we're starting from on AVX2 and the expected
lowering we use on AVX1.

Also, these test cases are very much generated. I've manually crafted
the specific pattern set that I'm hoping will be useful at exercising
the lowering code, but I've not (and could not) manually verify *all* of
these. I've spot checked and they seem legit to me.

As with the rest of vector shuffling, at a certain point the only really
useful way to check the correctness of this stuff is through fuzz
testing.

llvm-svn: 218267
2014-09-22 20:25:08 +00:00
Ehsan Akhgari 025ce8652f Make MCAsmParserSemaCallback::LookupInlineAsmLabel a pure virtual function
Summary:
r218229 made this function return a dummy nullptr in order to avoid
API breakage between clang/llvm.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5432

llvm-svn: 218266
2014-09-22 19:49:07 +00:00
Sanjay Patel 7939d7229d Use broadcasts to optimize overall size when loading constant splat vectors (x86-64 with AVX or AVX2).
We generate broadcast instructions on CPUs with AVX2 to load some constant splat vectors.
This patch should preserve all existing behavior with regular optimization levels, 
but also use splats whenever possible when optimizing for *size* on any CPU with AVX or AVX2.

The tradeoff is up to 5 extra instruction bytes for the broadcast instruction to save
at least 8 bytes (up to 31 bytes) of constant pool data.

Differential Revision: http://reviews.llvm.org/D5347

llvm-svn: 218263
2014-09-22 18:54:01 +00:00
Akira Hatanaka f2a721a875 Fix test case commited in r218242 to appease buildbot.
llvm-svn: 218261
2014-09-22 18:07:20 +00:00
Tom Stellard 9f73851e39 Revert "R600/SI: Add support for global atomic add"
This reverts commit r218254.

The global_atomics.ll test fails with asserts disabled.  For some reason,
the compiler fails to produce the atomic no return variants.

llvm-svn: 218257
2014-09-22 16:44:04 +00:00
Frederic Riss 220fa48491 Fix a test introduced in r218246 to work also on Windows.
llvm-svn: 218255
2014-09-22 16:17:32 +00:00
Tom Stellard 2355a77e74 R600/SI: Add support for global atomic add
llvm-svn: 218254
2014-09-22 15:35:35 +00:00
Tom Stellard 5a9a61ed7d R600/SI: Remove modifier operands from V_CNDMASK_B32_e64
Modifiers don't work for this instruction.

llvm-svn: 218253
2014-09-22 15:35:34 +00:00
Tom Stellard c9965f4186 R600: Don't set BypassSlowDiv for 64-bit division
BypassSlowDiv is used by codegen prepare to insert a run-time
check to see if the operands to a 64-bit division are really 32-bit
values and if they are it will do 32-bit division instead.

This is not useful for R600, which has predicated control flow since
both the 32-bit and 64-bit paths will be executed in most cases.  It
also increases code size which can lead to more instruction cache
misses.

llvm-svn: 218252
2014-09-22 15:35:32 +00:00
Tom Stellard 4349b19efb R600/SI: Use ISD::MUL instead of ISD::UMULO when lowering division
ISD::MUL and ISD:UMULO are the same except that UMULO sets an overflow
bit.  Since we aren't using the overflow bit, we should use ISD::MUL.

llvm-svn: 218251
2014-09-22 15:35:30 +00:00
Tom Stellard ec2e43c073 R600/SI: Add enums for some hard-coded values
llvm-svn: 218250
2014-09-22 15:35:29 +00:00
Pavel Chupin be9f12102f [x32] Fix segmented stacks support
Summary:
Update segmented-stacks*.ll tests with x32 target case and make
corresponding changes to make them pass.

Test Plan: tests updated with x32 target

Reviewers: nadav, rafael, dschuff

Subscribers: llvm-commits, zinovy.nis

Differential Revision: http://reviews.llvm.org/D5245

llvm-svn: 218247
2014-09-22 13:11:35 +00:00
Frederic Riss 955724e3f5 [dwarfdump] Dump full filenames as DW_AT_(decl|call)_file attribute values
Reviewers: dblaikie samsonov

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5192

llvm-svn: 218246
2014-09-22 12:36:04 +00:00
Frederic Riss 58ed53cfcd Allow DWARFDebugInfoEntryMinimal::getSubroutineName to resolve cross-unit references.
Summary: getSubroutineName is currently only used by llvm-symbolizer, thus add a binary test containing a cross-cu inlining example.

Reviewers: samsonov, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5394

llvm-svn: 218245
2014-09-22 12:35:53 +00:00
Robert Lougher 6da8a243f9 Fix assert when decoding PSHUFB mask
The PSHUFB mask decode routine used to assert if the mask index was out of
range (<0 or greater than the size of the vector).  The problem is, we can
legitimately have a PSHUFB with a large index using intrinsics.  The
instruction only uses the least significant 4 bits.  This change removes the
assert and masks the index to match the instruction behaviour.

llvm-svn: 218242
2014-09-22 11:54:38 +00:00
Oliver Stannard 14f97d0017 Downgrade DWARF2 section limit error to a warning
We currently emit an error when trying to assemble a file with more
than one section using DWARF2 debug info. This should be a warning
instead, as the resulting file will still be usable, but with a
degraded debug illusion.

llvm-svn: 218241
2014-09-22 10:45:16 +00:00
Hal Finkel b152ac5890 Update comment on AtomicRMWInst::Nand
As of July 2014, all backends have been updated to implement
AtomicRMWInst::Nand as ~(x & y) (and not as x & ~y, as some did previously).
This was added to the release notes in r212635 (and the LangRef had been
changed), but it seems that we forgot to update the header-file description.

llvm-svn: 218236
2014-09-22 06:47:10 +00:00
Chandler Carruth 7158c95d65 [x86] Move the AVX v4i64 test cases down to group them together.
Increasingly I don't want to mix the integer and floating point tests,
especially with AVX where they are handled quite differently.

llvm-svn: 218233
2014-09-22 03:05:23 +00:00
Jiangning Liu cd1d79e77c Add two thresholds lvi-overdefined-BB-threshold and lvi-overdefined-threshold
for LVI algorithm. For a specific value to be lowered, when the number of basic
blocks being checked for overdefined lattice value is larger than
lvi-overdefined-BB-threshold, or the times of encountering overdefined value
for a single basic block is larger than lvi-overdefined-threshold, the LVI
algorithm will stop further lowering the lattice value.

llvm-svn: 218231
2014-09-22 02:23:05 +00:00
Ehsan Akhgari db0e7061c6 ms-inline-asm: Add a sema callback for looking up label names
The implementation of the callback in clang's Sema will return an
internal name for labels.

Test Plan: Will be tested in clang.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D4587

llvm-svn: 218229
2014-09-22 02:21:35 +00:00
Chandler Carruth 12bbf7d922 [x86] Back out a bad choice about lowering v4i64 and pave the way for
a more sane approach to AVX2 support.

Fundamentally, there is no useful way to lower integer vectors in AVX.
None. We always end up with a VINSERTF128 in the end, so we might as
well eagerly switch to the floating point domain and do everything
there. This cleans up lots of weird and unlikely to be correct
differences between integer and floating point shuffles when we only
have AVX1.

The other nice consequence is that by doing things this way we will make
it much easier to write the integer lowering routines as we won't need
to duplicate the logic to check for AVX vs. AVX2 in each one -- if we
actually try to lower a 256-bit vector as an integer vector, we have
AVX2 and can rely on it. I think this will make the code much simpler
and more comprehensible.

Currently, I've disabled *all* support for AVX2 so that we always fall
back to AVX. This keeps everything working rather than asserting. That
will go away with the subsequent series of patches that provide
a baseline AVX2 implementation.

Please note, I'm going to implement AVX2 *without access to hardware*.
That means I cannot correctness test this path. I will be relying on
those with access to AVX2 hardware to do correctness testing and fix
bugs here, but as a courtesy I'm trying to sketch out the framework for
the new-style vector shuffle lowering in the context of the AVX2 ISA.

llvm-svn: 218228
2014-09-22 00:32:15 +00:00
Chandler Carruth 5d45962b2c [x86] Teach the new vector shuffle lowering how to cleverly lower single
input v8f32 shuffles which are not 128-bit lane crossing but have
different shuffle patterns in the low and high lanes. This removes most
of the extract/insert traffic that was unnecessary and is particularly
good at lowering cases where only one of the two lanes is shuffled at
all.

I've also added a collection of test cases with undef lanes because this
lowering is somewhat more sensitive to undef lanes than others.

llvm-svn: 218226
2014-09-21 23:46:13 +00:00
Chandler Carruth b195e860f9 [x86] Add a bunch of test cases where we have different shuffle patterns
in the high and low 128-bit lanes of a v8f32 vector.

No functionality change yet, but wanted to set up the baseline for my
next patch which will make these quite a bit better. =]

llvm-svn: 218224
2014-09-21 23:32:42 +00:00
Matt Arsenault a9627ae97a Fix typo
llvm-svn: 218223
2014-09-21 17:27:32 +00:00
Matt Arsenault 393366c691 Use llvm_unreachable instead of assert(!)
llvm-svn: 218222
2014-09-21 17:27:31 +00:00
Matt Arsenault 3673eba568 R600/SI: Don't use strings for single characters
llvm-svn: 218221
2014-09-21 17:27:28 +00:00
Lang Hames 27e58727d3 Remove redundant if test.
llvm-svn: 218220
2014-09-21 17:21:56 +00:00
Sanjay Patel b67bd262ea Refactor reciprocal square root estimate into target-independent function; NFC.
This is purely a plumbing patch. No functional changes intended.

The ultimate goal is to allow targets other than PowerPC (certainly X86 and Aarch64) to turn this:

z = y / sqrt(x)

into:

z = y * rsqrte(x)

using whatever HW magic they can use. See http://llvm.org/bugs/show_bug.cgi?id=20900 .

The first step is to add a target hook for RSQRTE, take the already target-independent code selfishly hoarded by PPC, and put it into DAGCombiner.

Next steps:

    The code in DAGCombiner::BuildRSQRTE() should be refactored further; tests that exercise that logic need to be added.
    Logic in PPCTargetLowering::BuildRSQRTE() should be hoisted into DAGCombiner.
    X86 and AArch64 overrides for TargetLowering.BuildRSQRTE() should be added.

Differential Revision: http://reviews.llvm.org/D5425

llvm-svn: 218219
2014-09-21 15:19:15 +00:00
Sanjay Patel d649235fc3 mop up: "Don’t duplicate function or class name at the beginning of the comment."
llvm-svn: 218218
2014-09-21 14:48:16 +00:00
Chandler Carruth 215037e35d [x86] With the stronger canonicalization of shuffles added in r218216,
the new vector shuffle lowering no longer needs to check both symmetric
forms of UNPCK patterns for v4f64.

llvm-svn: 218217
2014-09-21 13:37:51 +00:00
Chandler Carruth b3125c7522 [x86] Teach the new vector shuffle lowering to re-use the SHUFPS
lowering when it can use a symmetric SHUFPS across both 128-bit lanes.

This required making the SHUFPS lowering tolerant of other vector types,
and adjusting our canonicalization to canonicalize harder.

This is the last of the clever uses of symmetry I've thought of for
v8f32. The rest of the tricks I'm aware of here are to work around
assymetry in the mask.

llvm-svn: 218216
2014-09-21 13:35:14 +00:00
Chandler Carruth 02f3554971 [x86] Refactor the logic to form SHUFPS instruction patterns to lower
a generic vector shuffle mask into a helper that isn't specific to the
other things that influence which choice is made or the specific types
used with the instruction.

No functionality changed.

llvm-svn: 218215
2014-09-21 13:03:00 +00:00
Chandler Carruth 33eda72802 [x86] Teach the new vector shuffle lowering the basics about insertion
of a single element into a zero vector for v4f64 and v4i64 in AVX.
Ironically, there is less to see here because xor+blend is so crazy fast
that we can't really beat that to zero the high 128-bit lane.

llvm-svn: 218214
2014-09-21 12:49:46 +00:00
Chandler Carruth 43f5974ea0 [x86] Teach the new vector shuffle lowering how to lower to UNPCKLPS and
UNPCKHPS with AVX vectors by recognizing those patterns when they are
repeated for both 128-bit lanes.

With this, we now generate the exact same (really nice) code for
Quentin's avx_test_case.ll which was the most significant regression
reported for the new shuffle lowering. In fact, I'm out of specific test
cases for AVX lowering, the rest were AVX2 I think. However, there are
a bunch of pretty obvious remaining things to improve with AVX...

llvm-svn: 218213
2014-09-21 12:20:44 +00:00
Chandler Carruth 78f4798913 [x86] Add test cases for UNPCK instructions with v8f32 AVX vectors in
preparation for enhancing their support in the new vector shuffle
lowering.

llvm-svn: 218212
2014-09-21 12:13:11 +00:00
Chandler Carruth 88404c4f9b [x86] Begin teaching the new vector shuffle lowering among the most
important bits of cleverness: to detect and lower repeated shuffle
patterns between the two 128-bit lanes with a single instruction.

This patch just teaches it how to lower single-input shuffles that fit
this model using VPERMILPS. =] There is more that needs to happen here.

llvm-svn: 218211
2014-09-21 12:01:19 +00:00
Chandler Carruth 83252ac8f4 [x86] Regenerate this test case now that I've improved my script for
generating the test cases to format things more consistently and
actually catch all the operand sequences that should be elided in favor
of the asm comments. No actual changes here.

llvm-svn: 218210
2014-09-21 11:51:33 +00:00
Chandler Carruth 3dccabaf35 [x86] Explicitly lower to a blend early if it is trivial to do so for
v8f32 shuffles in the new vector shuffle lowering code.

This is very cheap to do and makes it much more clear that anything more
expensive but overlapping with this lowering should be selected
afterward (for example using AVX2's VPERMPS). However, no functionality
changed here as without this code we would fall through to create no-op
shuffles of each input and a blend. =]

llvm-svn: 218209
2014-09-21 11:40:39 +00:00
Chandler Carruth e81bfbada9 [x86] Teach the new vector shuffle lowering of v4f64 to prefer a direct
VBLENDPD over using VSHUFPD. While the 256-bit variant of VBLENDPD slows
down to the same speed as VSHUFPD on Sandy Bridge CPUs, it has twice the
reciprocal throughput on Ivy Bridge CPUs much like it does everywhere
for 128-bits. There isn't a downside, so just eagerly use this
instruction when it suffices.

llvm-svn: 218208
2014-09-21 11:17:55 +00:00
Chandler Carruth 6aea21df8e [x86] Add some more comprehensive tests for v4f64 blending.
llvm-svn: 218207
2014-09-21 11:12:19 +00:00
Chandler Carruth 908afb56c0 [x86] Re-generate a bunch of the v4f64 test cases with my new script.
This expands the integer cases to cover the fact that AVX2 moves their
lane-crossing shuffles into the integer domain. It also adds proper
support for AVX2 run lines and the "ALL" group when it doesn't matter.

llvm-svn: 218206
2014-09-21 11:07:41 +00:00
Chandler Carruth 8d0a1b209b [x86] Switch the blend implementation to use a MVT switch rather than
awkward conditions. The readability improvement of this will be even
more important as I generalize it to handle more types.

No functionality changed.

llvm-svn: 218205
2014-09-21 10:36:12 +00:00
Chandler Carruth f098cee2e3 [x86] Remove some essentially lying comments from the v4f64 path of the
new vector shuffle lowering.

llvm-svn: 218204
2014-09-21 10:27:14 +00:00
Chandler Carruth a746d776eb [x86] Fix a helper to reflect that what we actually care about is
128-bit lane crossings, not 'half' crossings. This came up in code
review ages ago, but I hadn't really addresesd it. Also added some
documentation for the helper.

No functionality changed.

llvm-svn: 218203
2014-09-21 09:35:25 +00:00
Chandler Carruth 293327ddcd [x86] Teach the new vector shuffle lowering the first step toward more
actual support for complex AVX shuffling tricks. We can do independent
blends of the low and high 128-bit lanes of an avx vector, so shuffle
the inputs into place and then do the blend at 256 bits. This will in
many cases remove one blend instruction.

The next step is to permute the low and high halves in-place rather than
extracting them and re-inserting them.

llvm-svn: 218202
2014-09-21 09:35:22 +00:00
David Majnemer 48227a3759 MC: Support aligned COMMON symbols for COFF
link.exe:
Fuzz testing has shown that COMMON symbols with size > 32 will always
have an alignment of at least 32 and all symbols with size < 32 will
have an alignment of at least the largest power of 2 less than the size
of the symbol.

binutils:
The BFD linker essentially work like the link.exe behavior but with
alignment 4 instead of 32.  The BFD linker also supports an extension to
COFF which adds an -aligncomm argument to the .drectve section which
permits specifying a precise alignment for a variable but MC currently
doesn't support editing .drectve in this way.

With all of this in mind, we decide to play a little trick: we can
ensure that the alignment will be respected by bumping the size of the
global to it's alignment.

llvm-svn: 218201
2014-09-21 09:18:07 +00:00
Chandler Carruth 8ff73c0170 [x86] Add some more test cases covering specific blend patterns.
llvm-svn: 218200
2014-09-21 09:01:26 +00:00
Chandler Carruth 7a6108d652 [x86] Add the beginnings of some tests for our v8f32 shuffle lowering
under AVX.

This really just documents the current state of the world. I'm going to
try to flesh it out to cover any test cases I plan to improve prior to
improving them so that the delta made by changes is actually visible to
code reviewers.

This is made easier by the fact that I now have a script to automate the
process of producing test cases including the check lines. =]

llvm-svn: 218199
2014-09-21 08:49:27 +00:00
NAKAMURA Takumi 4b21bac5fd RTDyldMemoryManager::getSymbolAddress(): Make sure to return 0 if symbol name is not met. [-Wreturn-type]
llvm-svn: 218195
2014-09-20 23:58:13 +00:00
Sanjay Patel 69df41e92e mop up: "Don’t duplicate function or class name at the beginning of the comment."
llvm-svn: 218194
2014-09-20 22:39:16 +00:00
Chandler Carruth a454812ac8 [x86] Teach the new vector shuffle lowering to use VPERMILPD for
single-input shuffles with doubles. This allows them to fold memory
operands into the shuffle, etc. This is just the analog to the v4f32
case in my prior commit.

llvm-svn: 218193
2014-09-20 22:09:27 +00:00
Chandler Carruth aa5b798ae7 [x86] Add an AVX run to the 128-bit v2 tests, teach them to have
a generic SSE and AVX mode in addition to a specific AVX1 test path, and
flesh out the AVX tests.

llvm-svn: 218192
2014-09-20 21:26:41 +00:00
David Majnemer fb83977538 Update tests which broke from r218189
llvm-svn: 218191
2014-09-20 21:18:43 +00:00
Chandler Carruth 6f80abac4e [x86] Teach the new vector shuffle lowering to use the AVX VPERMILPS
instruction for single-vector floating point shuffles. This in turn
allows the shuffles to fold a load into the instruction which is one of
the common regressions hit with the new shuffle lowering.

llvm-svn: 218190
2014-09-20 20:52:07 +00:00
David Majnemer 7d0dc3ef18 MC: Fix MCSectionCOFF::PrintSwitchToSection
We had a few bugs:
- We were considering the GVKind instead of just looking at the section
  characteristics
- We would never print out 'y' when a section was meant to be unreadable
- We would never print out 's' when a section was meant to be shared
- We translated IMAGE_SCN_MEM_DISCARDABLE to 'n' when it should've meant
  IMAGE_SCN_LNK_REMOVE

llvm-svn: 218189
2014-09-20 20:40:50 +00:00
Chandler Carruth 78a761ce8c [x86] Start moving to a fancier check syntax to reduce the need for
duplication of check lines. The idea is to have broad sets of
compilation modes that will frequently diverge without having to always
and immediately explode to the precise ISA feature set.

While this already helps due to VEX encoded differences, it will help
much more as I teach the new shuffle lowering about more of the new VEX
encoded instructions which can still be used to implement 128-bit
shuffles.

llvm-svn: 218188
2014-09-20 18:36:39 +00:00
Lang Hames b7fbf593e6 [MCJIT] Make RTDyldMemoryManager::getSymbolAddress's behaviour more consistent.
This patch modifies RTDyldMemoryManager::getSymbolAddress(Name)'s behavior to
make it consistent with how clients are using it: Name should be mangled, and
getSymbolAddress should demangle it on the caller's behalf before looking the
name up in the process. This patch also fixes the one client
(MCJIT::getPointerToFunction) that had been passing unmangled names (by having
it pass mangled names instead).

Background:

RTDyldMemoryManager::getSymbolAddress(Name) has always used a re-try mechanism
when looking up symbol names in the current process. Prior to this patch
getSymbolAddress first tried to look up 'Name' exactly as the user passed it in
and then, if that failed, tried to demangle 'Name' and re-try the look up. The
implication of this behavior is that getSymbolAddress expected to be called with
unmangled names, and that handling mangled names was a fallback for convenience.
This is inconsistent with how clients (particularly the RuntimeDyldImpl
subclasses, but also MCJIT) usually use this API. Most clients pass in mangled
names, and succeed only because of the fallback case. For clients passing in
mangled names, getSymbolAddress's old behavior was actually dangerous, as it
could cause unmangled names in the process to shadow mangled names being looked
up.

For example, consider:

foo.c:

int _x = 7;
int x() { return _x; }

foo.o:

000000000000000c D __x
0000000000000000 T _x

If foo.c becomes part of the process (E.g. via dlopen("libfoo.dylib")) it will
add symbols 'x' (the function) and '_x' (the variable) to the process. However
jit clients looking for the function 'x' will be using the mangled function name
'_x' (note how function 'x' appears in foo.o). When getSymbolAddress goes
looking for '_x' it will find the variable instead, and return its address and
in place of the function, leading to JIT'd code calling the variable and
crashing (if we're lucky).

By requiring that getSymbolAddress be called with mangled names, and demangling
only when we're about to do a lookup in the process, the new behavior
implemented in this patch should eliminate any chance of names being shadowed
during lookup.

There's no good way to test this at the moment: This issue only arrises when
looking up process symbols (not JIT'd symbols). Any test case would have to
generate a platform-appropriate dylib to pass to llvm-rtdyld, and I'm not
aware of any in-tree tool for doing this in a portable way.

llvm-svn: 218187
2014-09-20 17:44:56 +00:00
Justin Bogner 19a93ba814 llvm-cov: Allow creating CoverageMappings from filenames
llvm-svn: 218185
2014-09-20 17:19:52 +00:00
Justin Bogner 953e2407ed llvm-cov: Disentangle the coverage data logic from the display (NFC)
This splits the logic for actually looking up coverage information
from the logic that displays it. These were tangled rather thoroughly
so this change is a bit large, but it mostly consists of moving things
around. The coverage lookup logic itself now lives in the library,
rather than being spread between the library and the tool.

llvm-svn: 218184
2014-09-20 15:31:56 +00:00
Justin Bogner f584649ae3 llvm-cov: Move some reader debug output out of the tool.
This debug output is really for testing CoverageMappingReader, not the
llvm-cov tool. Move it to where it can be more useful.

llvm-svn: 218183
2014-09-20 15:31:51 +00:00
Lenny Maiorani 9eefc81219 Using a deque to manage the stack of nodes is faster here.
Vector is slow due to many reallocations as the size regularly changes in
  unpredictable ways. See the investigation provided on the mailing list for
  more information:

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120116/135228.html

llvm-svn: 218182
2014-09-20 13:29:20 +00:00
David Majnemer b8dbebb31c MC: Treat ReadOnlyWithRel and ReadOnlyWithRelLocal as ReadOnly for COFF
A problem with our old behavior becomes observable under x86-64 COFF
when we need a read-only GV which has an initializer which is referenced
using a relocation: we would mark the section as writable.  Marking the
section as writable interferes with section merging.

This fixes PR21009.

llvm-svn: 218179
2014-09-20 07:31:46 +00:00
Chandler Carruth 8c4cccd4aa [x86] Teach the v4f32 path of the new shuffle lowering to handle the
tricky case of single-element insertion into the zero lane of a zero
vector.

We can't just use the same pattern here as we do in every other vector
type because the general insertion logic can handle insertion into the
non-zero lane of the vector. However, in SSE4.1 with v4f32 vectors we
have INSERTPS that is a much better choice than the generic one for such
lowerings. But INSERTPS can do lots of other lowerings as well so
factoring its logic into the general insertion logic doesn't work very
well. We also can't just extract the core common part of the general
insertion logic that is faster (forming VZEXT_MOVL synthetic nodes that
lower to MOVSS when they can) because VZEXT_MOVL is often *faster* than
a blend while INSERTPS is slower! So instead we do a restrictive
condition on attempting to use the generic insertion logic to narrow it
to those cases where VZEXT_MOVL won't need a shuffle afterward and thus
will do better than INSERTPS. Then we try blending. Then we go back to
INSERTPS.

This still doesn't generate perfect code for some silly reasons that can
be fixed by tweaking the td files for lowering VZEXT_MOVL to use
XORPS+BLENDPS when available rather than XORPS+MOVSS when the input ends
up in a register rather than a load from memory -- BLENDPSrr has twice
the reciprocal throughput of MOVSSrr. Don't you love this ISA?

llvm-svn: 218177
2014-09-20 04:15:22 +00:00
Chandler Carruth 87dcf09367 [x86] Refactor the code for emitting INSERTPS to reuse the zeroable mask
analysis used elsewhere. This removes the last duplicate of this logic.
Also simplify the code here quite a bit. No functionality changed.

llvm-svn: 218176
2014-09-20 03:57:01 +00:00
Chandler Carruth 00389f3ed9 [x86] Generalize the single-element insertion lowering to work with
floating point types and use it for both v2f64 and v2i64 single-element
insertion lowering.

This fixes the last non-AVX performance regression test case I've gotten
of for the new vector shuffle lowering. There is obvious analogous
lowering for v4f32 that I'll add in a follow-up patch (because with
INSERTPS, v4f32 requires special treatment). After that, its AVX stuff.

llvm-svn: 218175
2014-09-20 03:32:25 +00:00
Chandler Carruth dba8444c2a [x86] Replace some duplicated logic reasoning about whether particular
vector lanes can be modeled as zero with a call to the new function that
computes a bit-vector representing that information.

No functionality changed here, but will allow doing more clever things
with the zero-test.

llvm-svn: 218174
2014-09-20 02:44:21 +00:00
David Majnemer f4dc456eef llvm-readobj: pretty-print special COFF section names
Print IMAGE_SYM_DEBUG and the like instead of (-2).

llvm-svn: 218172
2014-09-20 00:25:06 +00:00
Peter Collingbourne 975726345c Fix crash with an insertvalue that produces an empty object.
llvm-svn: 218171
2014-09-20 00:10:47 +00:00
Robin Morisset d780781b1f [X86] Erase some obsolete comments from README.txt
I just tried reproducing some of the optimization failures in README.txt in the
X86 backend, and many of them could not be reproduced. In general the entire
file appears quite bit-rotted, whatever interesting parts remain should be
moved to bugzilla, and the rest deleted. I did not spend the time to do that,
so I just deleted the few I tried reproducing which are obsolete, to save some
time to whoever will find the courage to do it.

llvm-svn: 218170
2014-09-19 23:56:46 +00:00
Eric Christopher b152660075 constify the TargetMachine being passed through the Mips subtarget
creation.

llvm-svn: 218169
2014-09-19 23:30:42 +00:00
Chris Bieneman 1efe80109a Converting InstrProf's error_category to a ManagedStatic to avoid static constructors and destructors.
llvm-svn: 218168
2014-09-19 23:19:24 +00:00
Duncan P. N. Exon Smith 75b64a551d DIBuilder: Delete dead code, NFC
There are two versions of `DIBuilder::createObjCIVar()`.  Delete the one
that's apparently dead.

llvm-svn: 218167
2014-09-19 23:17:58 +00:00
Matt Arsenault de0253791c R600: Un-xfail a test which passes with pass disabled
llvm-svn: 218165
2014-09-19 23:02:20 +00:00
Matt Arsenault 5e5b242946 R600/SI: Un-xfail tests which work now
llvm-svn: 218164
2014-09-19 23:02:18 +00:00
Chris Bieneman 1a98490ce5 Converting SpillPlacement's BlockFrequency threshold to a ManagedStatic to avoid static constructors and destructors.
llvm-svn: 218163
2014-09-19 22:46:28 +00:00
Matt Arsenault a986554377 R600/SI: Un xfail a test that works now
llvm-svn: 218162
2014-09-19 22:42:40 +00:00
Juergen Ributzka 92e8978e40 [FastIsel][AArch64] Fix a think-o in address computation.
When looking through sign/zero-extensions the code would always assume there is
such an extension instruction and use the wrong operand for the address.

There was also a minor issue in the handling of 'AND' instructions. I
accidentially used a 'cast' instead of a 'dyn_cast'.

llvm-svn: 218161
2014-09-19 22:23:46 +00:00
Chris Bieneman 684981e454 Converting object's error_category to a ManagedStatic to avoid static constructors and destructors.
llvm-svn: 218160
2014-09-19 22:09:18 +00:00
Chandler Carruth a6b7178b9d [x86] Hoist a function up to the rest of the non-type-specific lowering
helpers, and re-flow the logic to use early exit and be a bit more
readable.

No functionality changed.

llvm-svn: 218155
2014-09-19 21:52:10 +00:00
Chris Bieneman 194cdff88e Converting the JITDebugLock mutex to a ManagedStatic to avoid the static constructor and destructor.
llvm-svn: 218154
2014-09-19 21:38:20 +00:00
Chandler Carruth f85c6dfa45 [x86] Hoist the actual lowering logic into a helper function to separate
it from the shuffle pattern matching logic.

Also cleaned up variable names, comments, etc. No functionality changed.

llvm-svn: 218152
2014-09-19 21:20:08 +00:00
Chris Bieneman 3b1fd57e05 Converting FuncNames to a ManagedStatic to avoid static constructors and destructors.
llvm-svn: 218151
2014-09-19 21:07:01 +00:00
Tom Stellard ff795900eb R600/SI: Fix config value for number of gprs
In r217636, the value stored in KernelInfo.Num[VS]GPRSs was changed from
the highest GPR index used to the number of gprs in order to be
consistent with the name of the variable.

The code writing the config values still assumed that the value in this
variable was the highest GPR index used, which caused the compiler to
over report the number of GPRs being used.

https://bugs.freedesktop.org/show_bug.cgi?id=84089

llvm-svn: 218150
2014-09-19 20:42:37 +00:00
Chris Bieneman 770163e4f3 Eliminating static destructor for the BitCodeErrorCategory by converting to a ManagedStatic.
Summary: This is part of the overall goal of removing static initializers from LLVM.

Reviewers: chandlerc

Reviewed By: chandlerc

Subscribers: chandlerc, llvm-commits

Differential Revision: http://reviews.llvm.org/D5416

llvm-svn: 218149
2014-09-19 20:29:02 +00:00
Chandler Carruth 0fc0c22fa9 [x86] Fully generalize the zext lowering in the new vector shuffle
lowering to support both anyext and zext and to custom lower for many
different microarchitectures.

Using this allows us to get *exactly* the right code for zext and anyext
shuffles in all the vector sizes. For v16i8, the improvement is *huge*.
The new SSE2 test case added I refused to add before this because it was
sooooo muny instructions.

llvm-svn: 218143
2014-09-19 20:00:32 +00:00
Matt Arsenault 3f9b021c00 Add hsail and amdil64 to Triple
llvm-svn: 218142
2014-09-19 19:52:11 +00:00
Justin Bogner 5a6edad3d8 llvm-cov: Return unique_ptrs instead of filling objects (NFC)
Having create* functions return the object they create is more
readable than using an in-out parameter.

llvm-svn: 218139
2014-09-19 19:07:17 +00:00
Justin Bogner a829fde160 llvm-cov: Prevent a test from matching its own check lines
Since llvm-cov shows the source file in its output, be careful about
potentially matching the check lines themselves.

llvm-svn: 218138
2014-09-19 19:04:08 +00:00
Eric Christopher abc1297a70 Revert my earlier change to add "all" as a dependency to check. In
retrospect it really wasn't a good idea.

llvm-svn: 218136
2014-09-19 18:44:27 +00:00
David Blaikie db119544a2 Fix test case to be portable to different architectures.
llvm-svn: 218134
2014-09-19 18:31:25 +00:00
Matt Arsenault 4505f3a73d R600/SI: Fix test to prepare for scheduler
llvm-svn: 218131
2014-09-19 18:11:16 +00:00
David Blaikie 3a7ce252cc Omit DW_TAG_subprograms for subprograms without inlined subroutines when producing -gmlt data
To reduce the size of -gmlt data, skip the subprograms without any
inlined subroutines. Since we've now got the ability to make these
determinations in the backend (funnily enough - we added the flag so we
wouldn't produce ranges under -gmlt, but with this change we use the
flag, but go back to producing ranges under -gmlt).

Instead, just produce CU ranges to inform the consumer which parts of
the code are described by this CU's line table. Tools could inspect the
line table directly to compute the range, but the CU ranges only seem to
be about 0.5% of object/executable size, so I'm not too worried about
teaching llvm-symbolizer that trick just yet - it's certainly a possible
piece of future work.

Update an llvm-symbolizer test just to demonstrate that this schema is
acceptable there (if it wasn't, the compiler-rt tests would catch this,
but good to have an in-llvm-tree test for llvm-symbolizer's behavior
here)

Building the clang binary with -gmlt with this patch reduces the total
size of object files by 5.1% (5.56% without ranges) without compression
and the executable by 4.37% (4.75% without ranges).

llvm-svn: 218129
2014-09-19 17:03:16 +00:00
Frederic Riss 9ba9efff56 Change DwarfCompileUnit::createGlobalVariable to getOrCreateGlobalVariable.
Summary:
This will allow to request the creation of a forward delacred variable
at is point of use (for imported declarations, this will be
DwarfDebug::constructImportedEntityDIE) rather than having to put the
forward decl in a retention list.

Note that getOrCreateGlobalVariable returns the actual definition DIE when the
routine creates a declaration and a definition DIE. If you agree this is the
right behavior, then I'll have a followup patch that registers the definition
in the DIE map instead of the declaration as it is today (this 'breaks' only
one test, where we test that the imported entity is the declaration). I'm
not sure what's best here, but it's easy enough for a consumer to follow the
DW_AT_specification link to get to the declaration, whereas it takes more
work to find the actual definition from a declaration DIE.

Reviewers: echristo, dblaikie, aprantl

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5381

llvm-svn: 218126
2014-09-19 15:12:03 +00:00
Frederic Riss 101b5e2053 Turn local DWARFContext helpers getFileNameForUnit() and getFileLineInfoForCompileUnit() into full-blowm DWARFDebugLine::LineTable methods.
Summary:
getFileNameForUnit() is basically a wrapper around LineTable::getFileNameByIndex().
Fold its additional functionality (adding the DWARFUnit compilation dir) into
LineTable::getFileNameByIndex().

getFileLineInfoForCompileUnit() is a wrapper around getFileNameForUnit(). As
a function to search the line information by address, it seems natural to put
it in the LineTable also.

Before this commit only the Context with its private helpers could do Linetable
lookups. This newly exposed feature will be used by the DIE dumping code to
get access to file information referenced in DIE attributes.

This commit has already been partly reviewed in D5192 and contained an
additional and a bit controversial 'realpath' call that is left out of this
patch. We can reinstate that realpath code later if it is desirable.

Test Plan:
The patch contains no tests as it should be functionally equivalent to the
previous code. As requested in the last review, I checked if the relative
path handling copied from the Context to LineTable::getFileNameByIndex()
was covered, and indeed the symbolizer tests fail if it is removed.

Reviewers: dblaikie, echristo, aprantl, samsonov

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5354

llvm-svn: 218125
2014-09-19 15:11:51 +00:00
Benjamin Kramer 04f9da8f21 Elide unnecessary DenseMap copy.
No functionality change.

llvm-svn: 218122
2014-09-19 12:26:38 +00:00
Hal Finkel 62ac736faa Optionally enable more-aggressive FMA formation in DAGCombine
The heuristic used by DAGCombine to form FMAs checks that the FMUL has only one
use, but this is overly-conservative on some systems. Specifically, if the FMA
and the FADD have the same latency (and the FMA does not compete for resources
with the FMUL any more than the FADD does), there is no need for the
restriction, and furthermore, forming the FMA leaving the FMUL can still allow
for higher overall throughput and decreased critical-path length.

Here we add a new TLI callback, enableAggressiveFMAFusion, false by default, to
elide the hasOneUse check. This is enabled for PowerPC by default, as most
PowerPC systems will benefit.

Patch by Olivier Sallenave, thanks!

llvm-svn: 218120
2014-09-19 11:42:56 +00:00
Chandler Carruth 8a6536d4b2 [x86] Recognize that we can use duplication to widen v16i8 shuffles due
to undef lanes as well as defined widenable lanes. This dramatically
improves the lowering we use for undef-shuffles in a zext-ish pattern
for SSE2.

llvm-svn: 218115
2014-09-19 09:45:21 +00:00
Chandler Carruth 662b6d84e7 [x86] Actually test the SSE2 lowering for most of the zext-ish shuffles.
Not sure why I only did SSSE3 here. Also, I've left out some of the SSE2
ones because the shuffles are so absurd it's not worth transcribing
them. Will try to fix them to be sane and then check them.

llvm-svn: 218114
2014-09-19 08:51:06 +00:00
Chandler Carruth 2e275142cd [x86] Teach the new vector shuffle lowering to also use pmovzx for v4i32
shuffles that are zext-ing.

Not a lot to see here; the undef lane variant is better handled with
pshufd, but this improves the actual zext pattern.

llvm-svn: 218112
2014-09-19 08:37:44 +00:00
Justin Bogner 13ba23bb79 llvm-cov: Fix dropped lines when filters were applied
Uncovered lines in the middle of a covered region weren't being shown
when filtering to a particular function.

llvm-svn: 218109
2014-09-19 08:13:16 +00:00
Justin Bogner 116c16642d llvm-cov: Generalize -filename-equivalence
The filename-equivalence flag allows you to show coverage when your
source files don't have the same full paths as those that generated
the data. This is mostly useful for writing tests in a cross-platform
way.

This wasn't triggering in cases where the filename was derived
directly from the coverage data, which meant certain types of test
case were impossible to write. This patch fixes that, and following
patches involve tests that need this.

llvm-svn: 218108
2014-09-19 08:13:12 +00:00
Chandler Carruth 398ba9a018 [x86] Add a dedicated lowering path for zext-compatible vector shuffles
to the new vector shuffle lowering code.

This allows us to emit PMOVZX variants consistently for patterns where
it is a viable lowering. This instruction is both fast and allows us to
fold loads into it. This only hooks the new lowering up for i16 and i8
element widths, mostly so I could manage the change to the tests. I'll
add the i32 one next, although it is significantly less interesting.

One thing to note is that we already had some tests for these patterns
but those tests had far less horrible instructions. The problem is that
those tests weren't checking the strict start and end of the instruction
sequence. =[ As a consequence something changed in the lowering making
us generate *TERRIBLE* code for these patterns in SSE2 through SSSE3.
I've consolidated all of the tests and spelled out the madness that we
currently emit for these shuffles. I'm going to try to figure out what
has gone wrong here.

llvm-svn: 218102
2014-09-19 06:07:49 +00:00
Jiangning Liu ffbc690933 Optimize sext/zext insertion algorithm in back-end.
With this optimization, we will not always insert zext for values crossing
basic blocks, but insert sext if the users of a value crossing basic block
has preference of sign predicate.

llvm-svn: 218101
2014-09-19 05:30:35 +00:00
David Blaikie 03c3dbeb62 Omit DW_AT_frame_base under -gmlt for size
llvm-svn: 218100
2014-09-19 04:55:05 +00:00
David Blaikie 0b9438b1c1 Describe the -gmlt optimization committed in the previous revision.
llvm-svn: 218099
2014-09-19 04:47:46 +00:00
David Blaikie 73b65d236c Omit all the extra static attributes on subprograms in -gmlt
This omission will be done in a fancier manner once we're dealing with
"put gmlt in the skeleton CUs under fission" - it'll have to be
conditional on the kind of CU we're emitting into (skeleton or gmlt).

llvm-svn: 218098
2014-09-19 04:30:36 +00:00
Hans Wennborg c0f0c511db Fix an it's vs. its typo.
llvm-svn: 218093
2014-09-19 01:14:56 +00:00
Matt Arsenault 46cbc4367b R600: Better fix for bug 20982
Just do the left shift as unsigned to avoid the UB.

llvm-svn: 218092
2014-09-19 00:42:06 +00:00
Chandler Carruth be58fd2f2d [x86] Extend this test to cover SSE4.1. Nothing interesting here, but
paves the way for subsequent changes.

llvm-svn: 218091
2014-09-19 00:30:24 +00:00
Peter Collingbourne 6b433e4d46 Try to fix i686-cygming bots.
llvm-svn: 218086
2014-09-18 22:56:00 +00:00
Matt Arsenault c1728972e1 Use cast<> instead of unchecked dyn_cast<>
llvm-svn: 218085
2014-09-18 22:28:56 +00:00
Peter Collingbourne 042b7ffd37 Fix sphinx warning.
llvm-svn: 218081
2014-09-18 21:54:02 +00:00
Peter Collingbourne 10039c02ea LTO: introduce object file-based on-disk module format.
This format is simply a regular object file with the bitcode stored in a
section named ".llvmbc", plus any number of other (non-allocated) sections.

One immediate use case for this is to accommodate compilation processes
which expect the object file to contain metadata in non-allocated sections,
such as the ".go_export" section used by some Go compilers [1], although I
imagine that in the future we could consider compiling parts of the module
(such as large non-inlinable functions) directly into the object file to
improve LTO efficiency.

[1] http://golang.org/doc/install/gccgo#Imports

Differential Revision: http://reviews.llvm.org/D4371

llvm-svn: 218078
2014-09-18 21:28:49 +00:00
Quentin Colombet 17799fedb7 [ARM] Do not perform a tail call when the caller returns several values.
The fix is slightly different then x86 (see r216117) because the number of values
attached to a return can vary even for a single returned value (e.g., f64 yields
two returned values).

<rdar://problem/18352998>

llvm-svn: 218076
2014-09-18 21:17:50 +00:00
Justin Bogner 1ae21d1367 llvm-cov: Simplify FunctionInstantiationSetCollector (NFC)
- Replace std::unordered_map with DenseMap
- Use std::pair instead of manually combining two unsigneds
- Assert if insert is called with invalid arguments
- Avoid an unnecessary copy of a std::vector

llvm-svn: 218074
2014-09-18 20:31:26 +00:00
Robin Morisset 5349e8e532 Restore "[ARM, Fix] Fix emitLeading/TrailingFence on old ARM processors"
Summary:
This patch was originally in D5304 (I could not find a way to reopen that revision).
It was accepted, commited and broke the build bots because the overloading of
the constructor of ArrayRef for braced initializer lists is not supported by all
toolchains. I then reverted it, and propose this fixed version that uses a plain
C array instead in makeDMB (that array is then converted implicitly to an
ArrayRef, but that is not behind an ifdef). Could someone confirm me whether
initialization lists for plain C arrays are supported by every toolchain used
to build llvm ? Otherwise I can just initialize the array in the old way:
args[0] = ...; .. ; args[5] = ...;

Below is the description of the original patch:
```
I had only tested this code for ARMv7 and ARMv8. This patch adds several
fallback paths if the processor does not support dmb ish:
- dmb sy if a cortex-M with support for dmb
- mcr p15, #0, r0, c7, c10, #5 for ARMv6 (special instruction equivalent to a DMB)
These fallback paths were chosen based on the code for fence seq_cst.

Thanks to luqmana for having noticed this bug.
```

Test Plan: Added more cases to atomic-load-store.ll + make check-all

Reviewers: jfb, t.p.northover, luqmana

Subscribers: llvm-commits, aemerson

Differential Revision: http://reviews.llvm.org/D5386

llvm-svn: 218066
2014-09-18 18:56:04 +00:00
Aaron Ballman 0bb041b5f4 Reverting NFC changes from r218050. Instead, the warning was disabled for GCC in r218059, so these changes are no longer required.
llvm-svn: 218062
2014-09-18 17:34:23 +00:00
Lang Hames 8d4d0260a1 [MCJIT] Fix a debugging-output formatting bug in RuntimeDyld.
The mismatched mask (7 vs (ColsPerRow-1)) could lead to partial lines being
printed out of place.

llvm-svn: 218061
2014-09-18 16:43:24 +00:00
Frederic Riss 0baab0cded Revert part of r218041.
The patch moved some logic around in an attempt to generate potentially more
DW_AT_declaration attributes. The patch was flawed though and it stopped
generating the attribute in some cases.

llvm-svn: 218060
2014-09-18 16:41:04 +00:00
David Blaikie 9db82cf45d Disable GCC's -Woverloaded-virtual in the configure+make build. Clang's is better.
Turns out Clang's -Woverloaded-virtual is enabled by -Wall in both CMake
and Configure builds. We were only explicitly specifying it (thus
enabling GCC's version of the warning) in the Configure build.

The specific case of interest is:

  struct base {
    virtual void func();
    virtual void func(int);
  };
  struct derived: base {
    virtual void func(); // GCC warns here, because this causes
                         // func(int) to be hidden
  };

I don't think that's worth getting fussed about (& Clang (indirectly
me... since I improved this warning in Clang) agrees or we would've made
the warning catch these cases.

Technically this could still lead to bugs/confusion if base had
func(int) and func(bool), derived overrode func(bool) and then a caller
with a derived object tried to call func(42) - it would silently call
func(bool). We should probably improve clang's warnings to catch this at
the call site at some point.

llvm-svn: 218059
2014-09-18 16:34:25 +00:00
Matt Arsenault 6462f94884 R600: Bug 20982 - Avoid undefined left shift of negative value
I'm not sure what the hardware actually does, so don't
bother trying to fold it for now.

llvm-svn: 218057
2014-09-18 15:52:26 +00:00
Robert Khasanov f70f798474 [SKX] Deriving rmb multiclasses from general one (avx512_icmp_packed_rmb and avx512_icmp_cc_rmb).
Thanks Adam Nemet for notice about this.

llvm-svn: 218051
2014-09-18 14:06:55 +00:00
Aaron Ballman 11fa97fa32 Fixing a bunch of -Woverloaded-virtual warnings due to hiding getSubtargetImpl from the base class. NFC.
llvm-svn: 218050
2014-09-18 13:27:14 +00:00
Patrik Hagglund 07ccb1075a Alternative (to r216344) fix of gcc -Wpedantic.
As suggested by David Blaikie, this may be easier to read.

The original warning was:

../tools/llvm-cov/llvm-cov.cpp:53:49: error: ISO C++ forbids zero-size array 'argv' [-Werror=pedantic]
       std::string Invocation(std::string(argv[0]) + " " + argv[1]);

It seems to be the case that GCC's warning gets confused and thinks
'argv' is a declaration here. GCC bugzilla issue #61259.

llvm-svn: 218048
2014-09-18 11:52:57 +00:00
Frederic Riss be26dfb595 Always emit DW_AT_declaration attribute when the variable isn't a definition.
Summary:
This doesn't show up today as we don't emit decalration only variables. This
will be tested when the followup patches implementing import of forward
declared entities lands in clang.

Reviewers: echristo, dblaikie, aprantl

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5382

llvm-svn: 218041
2014-09-18 09:38:23 +00:00
Frederic Riss 076801a53b Fix DWARFUnitSection::getUnitForOffset().
The current code is only able to return the right unit if the passed offset
is the exact offset of a section. Generalize the search function by comparing
againt the offset of the next unit instead and by switching the search
algorithm to upper_bound.

This way, the unit returned is the first unit with a getNextUnitOffset()
strictly greater than the searched offset, which is exactly what we want.
Note that there is no need for testing the range of the resulting unit as
the offsets of a DWARFUnitSection are in a single contiguous range from
0 inclusive to lastUnit->getNextUnitOffset() exclusive.

Reviewers: dblaikie samsonov

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5262

llvm-svn: 218040
2014-09-18 09:38:15 +00:00
Chandler Carruth 9057fcaf82 [x86] Use PALIGNR for v4i32 and v2i64 blends when appropriate.
There is no purpose in using it for single-input shuffles as
pshufd is just as fast and doesn't tie the two operands. This removes
a substantial amount of wrong-domain blend operations in SSSE3 mode. It
also completes the usage of PALIGNR for integer shuffles and addresses
one of the test cases Quentin hit with the new vector shuffle lowering.

There is still the question of whether and when to use this for floating
point shuffles. It is faster than shufps or shufpd but in the integer
domain. I don't yet really have a good heuristic here for when to use
this instruction for floating point vectors.

llvm-svn: 218038
2014-09-18 09:00:25 +00:00
Chandler Carruth 0fe4928fbe [x86] Add an SSSE3 run and check mode to the 128-bit v2 tests of the new
vector shuffle lowering. This will be needed for up-coming palignr
tests.

llvm-svn: 218037
2014-09-18 08:33:04 +00:00
Daniel Sanders e747362b56 [mips] Remove custom versions of CCState::AnalyzeReturn() and CCState::AnalyzeCallReturn().
Summary:
The N32/N64 ABI's return f128 values in $f0 and $f2 for hard-float and $v0 and
$a0 for soft-float. The registers used in the soft-float case differ from the
usual $v0, and $v1 specified for return values.

Both cases were previously handled by duplicating the CCState::AnalyzeReturn()
and CCState::AnalyzeCallReturn() functions and modifying them to delegate to
a different assignment function for f128 and further replace the register type
for the hard-float case. There is a simpler way to do both of these.

We now use the common functions and select an initial assignment function based
on whether the original type is f128 or not. We then handle the hard-float case
using CCBitConvertToType<>.

No functional change.

Reviewers: vmedic

Reviewed By: vmedic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5269

llvm-svn: 218036
2014-09-18 08:28:39 +00:00
Juergen Ributzka 1d3a312e2d Revert "[FastISel][AArch64] Fold bit test and branch into TBZ and TBNZ."
Reverting it until I have time to investigate a regression.

llvm-svn: 218035
2014-09-18 08:07:40 +00:00
Juergen Ributzka 0f3076785f Fix previous commit: [FastISel][AArch64] Simplify XALU multiplies.
When folding the intrinsic flag into the branch or select we also have to
consider the fact if the intrinsic got simplified, because it changes the
flag we have to check for.

llvm-svn: 218034
2014-09-18 07:26:26 +00:00
Juergen Ributzka 2964b832ef [FastISel][AArch64] Simplify XALU multiplies.
Simplify {s|u}mul.with.overflow to {s|u}add.with.overflow when possible.

llvm-svn: 218033
2014-09-18 07:04:54 +00:00
Juergen Ributzka 2fc851002b [FastISel][AArch64] Followup commit for 218031 to handle negative offsets too.
llvm-svn: 218032
2014-09-18 07:04:49 +00:00
Juergen Ributzka a33070c321 [FastISel][AArch64] Try to fold the offset into the add instruction when simplifying a memory address.
Small optimization in 'simplifyAddress'. When the offset cannot be encoded in
the load/store instruction, then we need to materialize the address manually.
The add instruction can encode a wider range of immediates than the load/store
instructions. This change tries to fold the offset into the add instruction
first before materializing the offset in a register.

llvm-svn: 218031
2014-09-18 05:40:47 +00:00
Juergen Ributzka 99b7758ba0 [FastISel][AArch64] Fold 'AND' instruction during the address computation.
The 'AND' instruction could be used to mask out the lower 32 bits of a register.
If this is done inside an address computation we might be able to fold the
instruction into the memory instruction itself.

and  x1, x1, #0xffffffff   ---> ldrb x0, [x0, w1, uxtw]
ldrb x0, [x0, x1]

llvm-svn: 218030
2014-09-18 05:40:41 +00:00
Chandler Carruth e0d77ef053 [x86] Add an SSSE3 run to the v4 shuffle test.
llvm-svn: 218028
2014-09-18 04:38:32 +00:00
Saleem Abdulrasool bfdfb14a8f ARM: prevent crash on ELF directives on COFF
Certain directives are unsupported on Windows (some of which could/should be
supported).  We would not diagnose the use but rather crash during the emission
as we try to access the Target Streamer.  Add an assertion to prevent creating a
NULL reference (which is not permitted under C++) as well as a test to ensure
that we can diagnose the disabled directives.

llvm-svn: 218014
2014-09-18 04:28:29 +00:00
Chandler Carruth 867930aadf [x86] Initial step of teaching the new vector shuffle lowering about
PALIGNR. This just adds it to the v8i16 and v16i8 lowering steps where
it is completely unmatched. It also introduces the logic for detecting
rotation shuffle masks even in the presence of single input or blend
masks and arbitrarily undef lanes.

I've added fairly comprehensive tests for the matching logic in v8i16
because the tests at that size are much easier to write and manage.

I've not checked the SSE2 code generated for these tests because the
code is *horrible*. It is absolute madness. Testing it will just make
the test brittle without giving any interesting improvements in the
correctness confidence.

llvm-svn: 218013
2014-09-18 04:11:29 +00:00
Saleem Abdulrasool 8c61c6c0f9 ARM: use a more precise check for MachO
Rather than relying on support for a specific directive to determine if we are
targeting MachO, explicitly check the output format.

As an additional bonus, cleanup the caret diagnostic for the non-MachO case and
avoid the spurious error caused by not discarding the statement.

llvm-svn: 218012
2014-09-18 03:49:55 +00:00
Juergen Ributzka c35fb03661 [FastISel][AArch64] Fold bit test and branch into TBZ and TBNZ.
Teach selectBranch to fold bit test and branch into a single instruction (TBZ or
TBNZ).

llvm-svn: 218010
2014-09-18 02:44:13 +00:00
Eric Christopher d4838554ac Add file to CMake build as well.
llvm-svn: 218005
2014-09-18 00:39:20 +00:00
Eric Christopher d85ffb1fc0 Add a new pass FunctionTargetTransformInfo. This pass serves as a
shim between the TargetTransformInfo immutable pass and the Subtarget
via the TargetMachine and Function. Migrate a single call from
BasicTargetTransformInfo as an example and provide shims where TargetMachine
begins taking a Function to determine the subtarget.

No functional change.

llvm-svn: 218004
2014-09-18 00:34:14 +00:00
Samuel Antao 61570df715 Fix FastISel bug in boolean returns for PowerPC.
For PPC targets, FastISel does not take the sign extension information into account when selecting return instructions whose operands are constants. A consequence of this is that the return of boolean values is not correct. This patch fixes the problem by evaluating the sign extension information also for constants, forwarding this information to PPCMaterializeInt which takes this information to drive the sign extension during the materialization. 

llvm-svn: 217993
2014-09-17 23:25:06 +00:00
Samuel Antao 2fc771b1b6 Remove unnecessary blank space (test commit)
llvm-svn: 217991
2014-09-17 22:47:28 +00:00
David Blaikie dba94ec3c7 Reapply fix in r217988 (reverted in r217989) and remove the alternative fix committed in r217987.
This type isn't owned polymorphically (as demonstrated by making the
dtor protected and everything still compiling) so just address the
warning by protecting the base dtor and making the derived class final.

llvm-svn: 217990
2014-09-17 22:27:36 +00:00
David Blaikie d8978ec085 Revert "Fix -Wnon-virtual-dtor warning introduced in r217982."
An alternative fix was already committed.

This reverts commit r217988.

llvm-svn: 217989
2014-09-17 22:17:59 +00:00
David Blaikie 20dd05ccfd Fix -Wnon-virtual-dtor warning introduced in r217982.
llvm-svn: 217988
2014-09-17 22:15:40 +00:00
Chris Bieneman 4490b9ebb5 Fixing the sanitizer build failure:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/12868/steps/annotate/logs/stdio

llvm-svn: 217987
2014-09-17 22:09:38 +00:00
Juergen Ributzka f6430314b4 [FastISel][AArch64] Custom lower sdiv by power-of-2.
Emit an optimized instruction sequence for sdiv by power-of-2 depending on the
exact flag.

This fixes rdar://problem/18224511.

llvm-svn: 217986
2014-09-17 21:55:55 +00:00
Nick Kledzik 3e95fa431e [llvm-objdump] clean up test cases now that build bots are green
llvm-svn: 217985
2014-09-17 21:53:07 +00:00
Justin Bogner 5cbed6e09e llvm-cov: Push some more debug output into the View (NFC)
llvm-svn: 217984
2014-09-17 21:48:52 +00:00
Chris Bieneman cf93cbb7a4 Fixing a build error.
llvm-svn: 217983
2014-09-17 21:06:59 +00:00
Chris Bieneman ad070d0588 Refactoring SimplifyLibCalls to remove static initializers and generally cleaning up the code.
Summary: This eliminates ~200 lines of code mostly file scoped struct definitions that were unnecessary.

Reviewers: chandlerc, resistor

Reviewed By: resistor

Subscribers: morisset, resistor, llvm-commits

Differential Revision: http://reviews.llvm.org/D5364

llvm-svn: 217982
2014-09-17 20:55:46 +00:00
Rafael Espindola 51bd8ee309 Internalize common symbols when we can.
This fixes pr20974.

llvm-svn: 217981
2014-09-17 20:41:13 +00:00
Juergen Ributzka c611d72754 [FastISel][AArch64] Simplify mul to shift when possible.
This is related to rdar://problem/18369687.

llvm-svn: 217980
2014-09-17 20:35:41 +00:00
Alexey Samsonov 7bddb0a56a Exclude known and bugzilled failures from UBSan bootstrap
llvm-svn: 217979
2014-09-17 20:17:52 +00:00
Juergen Ributzka 3871c69422 [FastISel][AArch64] Fold mul into add/sub and logical operations.
Try to fold the multiply into the add/sub or logical operations (when
possible).

This is related to rdar://problem/18369687.

llvm-svn: 217978
2014-09-17 19:51:38 +00:00
Juergen Ributzka 22d4cd0a4f [FastISel][AArch64] Fold mul into the address computation of memory operations.
Teach 'computeAddress' to also fold multiplies into the address computation
(when possible).

This fixes rdar://problem/18369443.

llvm-svn: 217977
2014-09-17 19:19:31 +00:00
Justin Bogner fe357c003f llvm-cov: Rework the API for getting the coverage of a file (NFC)
This encapsulates how we handle the coverage regions of a file or
function. In the old model, the user had to deal with nested regions,
so they needed to maintain their own auxiliary data structures to get
any useful information out of this. The new API provides a sequence of
non-overlapping coverage segments, which makes it possible to render
coverage information in a single pass and avoids a fair amount of
extra work.

llvm-svn: 217975
2014-09-17 18:23:47 +00:00
Alexey Samsonov 2c8ed4b84f Fixup for r217830. Don't do left shifts on negative values
llvm-svn: 217974
2014-09-17 18:23:07 +00:00
Robin Morisset bf26f8fd56 Revert "[ARM, Fix] Fix emitLeading/TrailingFence on old ARM processors"
It is breaking the build on the buildbots but works fine on my machine, I revert
while trying to understand what happens (it appears to depend on the compiler used
to build, I probably used a C++11 feature that is not perfectly supported by some
of the buildbots).

This reverts commit feb3176c4d006f99af8b40373abd56215a90e7cc.

llvm-svn: 217973
2014-09-17 18:09:13 +00:00
Juergen Ributzka d8e30c0db8 [FastISel][AArch64] Fold compare with zero and branch into CBZ and CBNZ.
This takes advanatage of the CBZ and CBNZ instruction to further optimize the
common null check pattern into a single instruction.

This is related to rdar://problem/18358882.

llvm-svn: 217972
2014-09-17 18:05:34 +00:00
Yaron Keren d122211e60 Another required re-setting for MCStreamer::reset().
llvm-svn: 217970
2014-09-17 17:50:34 +00:00
Matt Arsenault 972c12aedc R600/SI: Remove assert
Since read2 / write2 are emitted for 4-byte aligned 8-byte
accesses, these are seen by the scheduler.

The DAG scheduler is semi-deprecated, so just
ignore these for now.

llvm-svn: 217969
2014-09-17 17:48:32 +00:00
Matt Arsenault 0e75a06451 R600/SI: Rough first implementation of shouldClusterLoads
llvm-svn: 217968
2014-09-17 17:48:30 +00:00
Alexey Samsonov cce5701cdb Fix float division-by-zero in R600 scheduler.
This bug was reported by UBSan.

llvm-svn: 217967
2014-09-17 17:47:21 +00:00
Juergen Ributzka fb3e14375a [FastISel][AArch64] Improve branch selection to support all FP conditions.
This adds the last two missing floating-point condition codes (FCMP_UEQ and
FCMP_ONE) also to the branch selection. In these two cases an additonal branch
instruction is required.

This also adds unit tests to checks all the different condition codes.

This is related o rdar://problem/18358882.

llvm-svn: 217966
2014-09-17 17:46:47 +00:00
Robin Morisset 1c8a457575 [ARM, Fix] Fix emitLeading/TrailingFence on old ARM processors
Summary:
I had only tested this code for ARMv7 and ARMv8. This patch adds several
fallback paths if the processor does not support dmb ish:
- dmb sy if a cortex-M with support for dmb
- mcr p15, #0, r0, c7, c10, #5 for ARMv6 (special instruction equivalent to a DMB)
These fallback paths were chosen based on the code for fence seq_cst.

Thanks to luqmana for having noticed this bug.

Test Plan: Added more cases to atomic-load-store.ll + make check-all

Reviewers: jfb, t.p.northover, luqmana

Subscribers: aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D5304

llvm-svn: 217965
2014-09-17 17:41:16 +00:00
Matt Arsenault 02dc26529e R600/SI: Change formatting of printed FP immediates
Only 1 decimal place should be printed for inline immediates.
Other constants should be hex constants.

Does not include f64 tests because folding those inline
immediates currently does not work.

llvm-svn: 217964
2014-09-17 17:32:13 +00:00
Chad Rosier 307b50b0f6 [IndVarSimplify] Partially revert r217953 to see if this fixes the bots.
Specifically, disable widening of unsigned compare instructions.

llvm-svn: 217962
2014-09-17 16:35:09 +00:00
Justin Bogner 69fe4e98fa LineIterator: Provide a variant that keeps blank lines
It isn't always useful to skip blank lines, as evidenced by the
somewhat awkward use of line_iterator in llvm-cov. This adds a knob to
control whether or not to skip blanks.

llvm-svn: 217960
2014-09-17 15:43:01 +00:00
Matt Arsenault 253e5da7ad R600/SI: Remove promotion of instructions to e64 forms.
Instructions are now generally selected to the e64 forms originally,
and shrunk down later. Rename foldOperands to legalizeOperands,
since that's really most of what it tries to do.

llvm-svn: 217959
2014-09-17 15:35:43 +00:00
Chad Rosier bb99f40530 [IndVarSimplify] Widen loop compare instructions.
This improves other optimizations such as LSR.  A sext may be added to the
compare's other operand, but this can often be hoisted outside of the loop.

llvm-svn: 217953
2014-09-17 14:10:33 +00:00
Andrea Di Biagio 5b92b4971a [InstCombine] Fix wrong folding of constant comparison involving ahsr and negative quantities (PR20945).
Example:
define i1 @foo(i32 %a) {
  %shr = ashr i32 -9, %a
  %cmp = icmp ne i32 %shr, -5
  ret i1 %cmp
}

Before this fix, the instruction combiner wrongly thought that %shr
could have never been equal to -5. Therefore, %cmp was always folded to 'true'.
However, when %a is equal to 1, then %cmp evaluates to 'false'. Therefore,
in this example, it is not valid to fold %cmp to 'true'.
The problem was only affecting the case where the comparison was between
negative quantities where one of the quantities was obtained from arithmetic
shift of a negative constant.

This patch fixes the problem with the wrong folding (fixes PR20945).
With this patch, the 'icmp' from the example is now simplified to a
comparison between %a and 1. This still allows us to get rid of the arithmetic
shift (%shr).

llvm-svn: 217950
2014-09-17 11:32:31 +00:00
Frederic Riss 5e6bc9e162 Add DIBuilder functions to build RAUWable DIVariables and DIFunctions.
Summary: These will be used to implement support for useful forward declarartions.

Reviewers: echristo, dblaikie, aprantl

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5328

llvm-svn: 217949
2014-09-17 09:28:34 +00:00
Yaron Keren 559b47d051 Add and update reset() and doInitialization() methods to MC* and passes.
This enables reusing a PassManager instead of re-constructing it every time.

llvm-svn: 217948
2014-09-17 09:25:36 +00:00
Toma Tabacu 351b2feeb3 [mips] Add assembler support for the .set nodsp directive.
Summary: This directive is used to tell the assembler to reject DSP-specific instructions.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D5142

llvm-svn: 217946
2014-09-17 09:01:54 +00:00
Justin Bogner a187f791e0 llvm-cov: Fix a typo
It doesn't make sense for this default parameter to be false, since
false makes the function a no-op.

llvm-svn: 217945
2014-09-17 08:12:12 +00:00
Pavel Chupin 37b65d81dd [x32] Fix function indirect calls
Summary: Zero-extend register to 64-bit for callq/jmpq.

Test Plan: 3 tests added

Reviewers: nadav, dschuff

Subscribers: llvm-commits, zinovy.nis

Differential Revision: http://reviews.llvm.org/D5355

llvm-svn: 217942
2014-09-17 07:09:23 +00:00
Justin Bogner 99e9518751 Add move constructors/assignment to make MSVC happy after r217940
llvm-svn: 217941
2014-09-17 06:32:48 +00:00
Justin Bogner 5e1400a81c llvm-cov: Distinguish expansion/instantiation from SourceCoverageView
SourceCoverageView currently has "Kind" and a list of child views, all
of which must have either an expansion or an instantiation Kind. In
addition to being an error-prone design, this makes it awkward to
differentiate between the two child types and adds a number of
optionally used members to the type.

Split the subview types into their own separate objects, and maintain
lists of each rather than one combined "Children" list.

llvm-svn: 217940
2014-09-17 05:33:20 +00:00
David Majnemer b435a4214e InstSimplify: Don't allow (x srem y) urem y -> x srem y
Let's consider the case where:
%x i16 = 32768
%y i16 = 384

%x srem %y = 65408
(%x srem %y) urem %y = 128

llvm-svn: 217939
2014-09-17 04:16:35 +00:00
David Majnemer ac717f0972 InstSimplify: ((X % Y) % Y) -> (X % Y)
Patch by Sonam Kumari!

Differential Revision: http://reviews.llvm.org/D5350

llvm-svn: 217937
2014-09-17 03:34:34 +00:00
Nick Kledzik a637536ec1 [Object] keep trailing '\0' out of StringRef when parsing mach-o bindings
llvm-svn: 217935
2014-09-17 01:51:43 +00:00
Richard Trieu 1fbe1a8ba7 | -> ||
No functional change.

llvm-svn: 217934
2014-09-17 01:47:52 +00:00
Nick Kledzik 2d2b254e7c Fix identify_magic() with mach-o stub dylibs.
The wrong value was returned and the unittest did not cover the stub dylib case.

llvm-svn: 217933
2014-09-17 00:53:44 +00:00
Nick Kledzik 3006130a8e [llvm-objdump] properly use c_str() with format("%s"). Improve getLibraryShortNameByIndex() error handling.
llvm-svn: 217930
2014-09-17 00:25:22 +00:00
Robin Morisset 25c8e318e4 [X86] Use the generic AtomicExpandPass instead of X86AtomicExpandPass
This required a new hook called hasLoadLinkedStoreConditional to know whether
to expand atomics to LL/SC (ARM, AArch64, in a future patch Power) or to
CmpXchg (X86).

Apart from that, the new code in AtomicExpandPass is mostly moved from
X86AtomicExpandPass. The main result of this patch is to get rid of that
pass, which had lots of code duplicated with AtomicExpandPass.

llvm-svn: 217928
2014-09-17 00:06:58 +00:00
Quentin Colombet ac55b15bf4 [CodeGenPrepare][AddressingModeMatcher] The promotion mechanism was expecting
instructions when truncate, sext, or zext were created. Fix that.

llvm-svn: 217926
2014-09-16 22:36:07 +00:00
Nick Kledzik abd2987907 [llvm-objdump] improve error reporting of bad mach-o ordinals
llvm-svn: 217909
2014-09-16 22:03:13 +00:00
Yaron Keren cca43c15b5 This add a reset method for WinCOFFObjectWriter, like other MC* classes.
llvm-svn: 217907
2014-09-16 21:31:04 +00:00
Nick Kledzik 53a80d3a46 tweak test case for debugging bot
llvm-svn: 217906
2014-09-16 21:29:54 +00:00
Owen Anderson bfc80a45a7 Add back a fallback case for targets that do not or cannot implement getNoopForMachoTarget().
llvm-svn: 217899
2014-09-16 20:28:00 +00:00
Kevin Enderby 98c9accace Hookup the MCSymbolizer to llvm-objdump’s disassembly for Mach-O files.
First step done in this commit is to get flush out enough of the
SymbolizerGetOpInfo() routine to symbolic an X86_64 hello world .o and
its loading of the literal string and call to printf.  Also the code to
symbolicate the X86_64_RELOC_SUBTRACTOR relocation and a test is also
added to show a slightly more complicated case.

Next will be to flush out enough of SymbolizerSymbolLookUp() to get the
literal string “Hello world” printed as a comment on the instruction that load
the pointer to it.

llvm-svn: 217893
2014-09-16 18:00:57 +00:00
Matt Arsenault 6652403c2d Fix typo
llvm-svn: 217892
2014-09-16 18:00:23 +00:00
Reid Kleckner 7587744359 Add a missing return to operator=
llvm-svn: 217889
2014-09-16 17:39:46 +00:00
Reid Kleckner 6cd75b36ed Fix move-only type issues in Interpreter with MSVC
MSVC 2012 cannot infer any move special members, but it will call them
if available. MSVC 2013 cannot infer move assignment. Therefore,
explicitly implement the special members for the ExecutionContext class
and its contained types.

llvm-svn: 217887
2014-09-16 17:28:15 +00:00
Adam Nemet e5a07167f5 [TableGen] Fully resolve class-instance values before defs in multiclasses
By class-instance values I mean 'Class<Arg>' in 'Class<Arg>.Field' or in
'Other<Class<Arg>>' (syntactically s SimpleValue).  This is to differentiate
from unnamed/anonymous record definitions (syntactically an ObjectBody) which
are not affected by this change.

Consider the testcase:

    class Struct<int i> {
      int I = !shl(i, 1);
      int J = !shl(I, 1);
    }

    class Class<Struct s> {
        int Class_J = s.J;
    }

    multiclass MultiClass<int i> {
      def Def : Class<Struct<i>>;
    }

    defm Defm : MultiClass<2>;

Before this fix, DefmDef.Class_J yields !shl(I, 1) instead of 8.

This is the sequence of events.  We start with this:

    multiclass MultiClass<int i> {
      def Def : Class<Struct<i>>;
    }

During ParseDef the anonymous object for the class-instance value is created:

    multiclass Multiclass<int i> {
      def anonymous_0 : Struct<i>;

      def Def : Class<NAME#anonymous_0>;
    }

Then class Struct<i> is added to anonymous_0.  Also Class<NAME#anonymous_0> is
added to Def:

    multiclass Multiclass<int i> {
      def anonymous_0 {
        int I = !shl(i, 1);
        int J = !shl(I, 1);
      }

      def Def {
        int Class_J = NAME#anonymous_0.J;
      }
    }

So far so good but then we move on to instantiating this in the defm
by substituting the template arg 'i'.

This is how the anonymous prototype looks after fully instantiating.

    defm Defm = {
      def Defmanonymous_0 {
         int I = 4;
         int J = !shl(I, 1);
      }

Note that we only resolved the reference to the template arg.  The
non-template-arg reference in 'J' has not been resolved yet.

Then we go on to instantiating the Def prototype:

      def DefmDef {
         int Class_J = NAME#anonymous_0.J;
      }

Which is resolved to Defmanonymous_0.J and then to !shl(I, 1).

When we fully resolve each record in a defm, Defmanonymous_0.J does get set
to 8 but that's too late for its use.

The patch adds a new attribute to the Record class that indicates that this
def is actually a class-instance value that may be *used* by other defs in a
multiclass.  (This is unlike regular defs which don't reference each other and
thus can be resolved indepedently.)  They are then fully resolved before the
other defs while the multiclass is instantiated.

I added vg_leak to the new test.  I am not sure if this is necessary but I
don't think I have a way to test it.  I can also check in without the XFAIL
and let the bots test this part.

Also tested that X86.td.expanded and AAarch64.td.expanded were unchange before
and after this change.  (This issue triggering this problem is a WIP patch.)

Part of <rdar://problem/17688758>

llvm-svn: 217886
2014-09-16 17:14:13 +00:00
Adam Nemet 0c7caf434f [X86] Improve comment
llvm-svn: 217885
2014-09-16 17:14:10 +00:00
Moritz Roth eef9f4dc74 ARM load/store optimizer: Don't materialize a new base register with
ADDS/SUBS unless it's safe to clobber the condition flags.

If the merged instructions are in a range where the CPSR is live,
e.g. between a CMP -> Bcc, we can't safely materialize a new base
register.

This problem is quite rare, I couldn't come up with a test case and I've
never actually seen this happen in the tests I'm running - there is a
potential trigger for this in LNT/oggenc (spills being inserted between
a CMP/Bcc), but at the moment this isn't being merged. I'll try to
reduce that into a small test case once I've committed my upcoming patch
to make merging less conservative.

llvm-svn: 217881
2014-09-16 16:25:07 +00:00
Benjamin Kramer 4be38d724f Spell out a move ctor. Even the 2013 vintage of MSVC cannot synthesize move ctors.
llvm-svn: 217879
2014-09-16 16:16:39 +00:00
Benjamin Kramer 6363eb2d11 Interpreter: Hack around a series of bugs in MSVC 2012 that copies around this
move-only struct.

I feel terrible now, but at least it's shielded away from proper compilers.

llvm-svn: 217875
2014-09-16 15:26:41 +00:00
Toma Tabacu 65f1057191 [mips] Improve the error messages given by MipsAsmParser.
Summary: Changed error messages to be more informative and to resemble other clang/llvm error messages (first letter is lower case, no ending punctuation) and updated corresponding tests.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D5065

llvm-svn: 217873
2014-09-16 15:00:52 +00:00
Frederic Riss f459fabfb3 Make DWARFUnitSection final and change base class to non-virtual protected destructor.
As per dblaikie suggestion.

llvm-svn: 217871
2014-09-16 12:58:01 +00:00
Toma Tabacu 18227e6f20 [mips] Move 32-bit ADDiu instruction alias from Mips64InstrInfo.td to MipsInstrInfo.td.
Patch by Vasileios Kalintiris.

Differential Revision: http://reviews.llvm.org/D5244

llvm-svn: 217868
2014-09-16 10:19:03 +00:00
Toma Tabacu 25cdd222b0 [mips] Marked the ADDi instruction aliases as not available in Mips32R6 and Mips64R6.
Patch by Vasileios Kalintiris.

Differential Revision: http://reviews.llvm.org/D5242

llvm-svn: 217867
2014-09-16 09:26:09 +00:00
Joe Abbey 8e72eb780e ARMAsmBackend uses a factory method to generate binary file format specific
objects.  There were a few FIXMEs in ARMAsmBackend.cpp suggesting the class
definitions should be in a separate file.  Starting with ARMAsmBackend, the
class definition has been put in a header file, and #includes reduced.  Each
sub-type of ARMAsmBackend is now in its own header file.

Derived types have been painted with a different color of bike-shed:

  s/DarwinARMAsmBackend/ARMAsmBackendDarwin/g
  s/ARMWinCOFFAsmBackend/ARMAsmBackendWinCOFF/g
  s/ELFARMAsmBackend/ARMAsmBackendELF/g

Finally, clang-format has been run across ARMAsmBackend.cpp

llvm-svn: 217866
2014-09-16 09:18:23 +00:00
Tilmann Scheller 40fc9595c8 [InstCombine] Remove redundant test case.
Patch by Sonam Kumari!

Differential Revision: http://reviews.llvm.org/D5284

llvm-svn: 217865
2014-09-16 08:50:10 +00:00
Elena Demikhovsky 27012478d2 AVX-512: added cost for some AVX-512 instructions
llvm-svn: 217863
2014-09-16 07:57:37 +00:00
Justin Bogner 76e251c03b llvm-cov: Rename a variable and clean up its usage
Offset is a terrible name for an indentation / nesting level, and it
confuses me every time I look at this code.

llvm-svn: 217861
2014-09-16 06:21:57 +00:00
Nick Kledzik c17c8093db tweak test case to help build bot
llvm-svn: 217860
2014-09-16 04:51:38 +00:00
Hal Finkel cc4f31d3d7 Fix BasicTTI::getCmpSelInstrCost to deal with illegal vector types
The default implementation of getCmpSelInstrCost, which provides the cost of
icmp/fcmp/select instructions, did not deal sensibly with illegal vector types
that were scalarized. We'd ask for the legalization cost of the vector type,
which would return something like (4, f64) given an input of <4 x double>, and
we'd then check the TLI status of the ISD opcode on that scalar type. This would
result in querying (ISD::VSELECT, f64), for example. Amusingly enough,
ISD::VSELECT on scalar types is marked as Legal by default (as with most other
operations), and most backends never change this because VSELECT is never
generated on scalars. However, seeing the resulting operation as Legal, we'd
neglect to add the scalarization cost before returning. The result is that we'd
grossly under-estimate the cost of cmps/selects on illegal vector types.

Now, if type legalization clearly results in scalarization, we skip the early
return and add the scalarization cost.

llvm-svn: 217859
2014-09-16 04:35:50 +00:00
David Majnemer 2cbc13878f yaml2obj: Support bigobj
Teach yaml2obj how to make a bigobj COFF file.  Like the rest of LLVM,
we automatically decide whether or not to use regular COFF or bigobj
COFF on the fly depending on how many sections the resulting object
would have.

This ends the task of adding bigobj support to LLVM.

N.B. This was tested by forcing yaml2obj to be used in bigobj mode
regardless of the number of sections.  While a dedicated test was
written, the smallest I could make it was 36 MB (!) of yaml and it still
took a significant amount of time to execute on a powerful machine.

llvm-svn: 217858
2014-09-16 03:52:46 +00:00
Nick Kledzik c1a750bba6 tweak test case to help solve why failing on one build bot
llvm-svn: 217856
2014-09-16 02:33:36 +00:00
Chandler Carruth 429c29d187 [x86] Remove a FIXME that doesn't make any sense. Only the lanes feeding
the blend that is matched by this are "used" in any sense, and so any
build_vector or other nodes feeding these will already drop other lanes.

llvm-svn: 217855
2014-09-16 02:16:42 +00:00
Chandler Carruth b1c024a2de [x86] Cleanup an unused variable by actually using it in the non-asserts
place where it was needed.

llvm-svn: 217854
2014-09-16 02:14:51 +00:00
Nick Kledzik 56ebef45ef [llvm-objdump] for mach-o add -bind, -lazy-bind, and -weak-bind options
This finishes the ability of llvm-objdump to print out all information from
the LC_DYLD_INFO load command.

The -bind option prints out symbolic references that dyld must resolve 
immediately.

The -lazy-bind option prints out symbolc reference that are lazily resolved on 
first use.

The -weak-bind option prints out information about symbols which dyld must
try to coalesce across images.

llvm-svn: 217853
2014-09-16 01:41:51 +00:00
Chandler Carruth 74acb46d26 [x86] Remove the last vestiges of the BLENDI-based ADDSUB pattern
matching. This design just fundamentally didn't work because ADDSUB is
available prior to any legal lowerings of BLENDI nodes. Instead, we have
a dedicated ADDSUB synthetic ISD node which is pattern matched trivially
into the instructions. These nodes are then recognized by both the
existing and a trivial new lowering combine in the backend. Removing
these patterns required adding 2 missing shuffle masks to the DAG
combine, without which tests would have failed. Added the masks and
a helpful assert as well to catch if anything ever goes wrong here.

llvm-svn: 217851
2014-09-16 00:39:08 +00:00
Juergen Ributzka 59e631c728 [FastISel][AArch64] Add vector support to argument lowering.
Lower the first 8 vector arguments too.

llvm-svn: 217850
2014-09-16 00:25:30 +00:00
Chandler Carruth f845e89425 [x86] As a follow-up to r217819, don't check for VSELECT legality now
that we don't use VSELECT and directly emit an addsub synthetic node.
Also remove a stale comment referencing VSELECT.

The test case is updated to use 'core2' which only has SSE3, not SSE4.1,
and it still passes. Previously it would not because we lacked
sufficient blend support to legalize the VSELECT.

llvm-svn: 217849
2014-09-16 00:24:42 +00:00
Chandler Carruth de5f2b356b [x86] Add the beginnings of a proper DAG combine to match ADDSUBPS and
ADDSUBPD nodes out of blends of adds and subs.

This allows us to actually form these instructions with SSE3 rather than
only forming them when we had both SSE3 for the ADDSUB instructions and
SSE4.1 for the blend instructions. ;] Kind-of important.

I've adjusted the CPU requirements on one of the tests to demonstrate
this kicking in nicely for an SSE3 cpu configuration.

llvm-svn: 217848
2014-09-16 00:15:20 +00:00
Juergen Ributzka f693787ed0 [FastISel][AArch64] Add missing test case for previous commit.
This adds the missing test case for the previous commit:
Allow handling of vectors during return lowering for little endian machines.

Sorry for the noise.

llvm-svn: 217847
2014-09-15 23:47:57 +00:00
Juergen Ributzka de47c47cc1 [FastISel][AArch64] Allow handling of vectors during return lowering for little endian machines.
Allow handling of vectors during return lowering at least for little endian machines.
This was restricted in r208200 to fix it for big endian machines (according to
the comment), but it also disabled it for little endian too.

llvm-svn: 217846
2014-09-15 23:40:10 +00:00
Juergen Ributzka b9e49c73ee [FastISel][AArch64] Update function and variable names to follow the coding standard. NFC.
llvm-svn: 217845
2014-09-15 23:20:17 +00:00
David Blaikie ba656e1d7c DebugInfo: Add comment describing the need to disable address pool usage in skeleton units.
Post commit review from Eric Christopher.

llvm-svn: 217842
2014-09-15 22:41:25 +00:00
Juergen Ributzka cbe802e730 [FastISel][AArch64] Make AArch64FastISel class final. NFC.
llvm-svn: 217840
2014-09-15 22:33:11 +00:00
Juergen Ributzka 993224a553 [FastISel][AArch64] Lower sin/cos/pow to runtime lib calls.
Also lower sin/cos/pow to runtime lib calls.

This fixes rdar://problem/18343468.

llvm-svn: 217839
2014-09-15 22:33:06 +00:00
Justin Bogner 92bb302314 llvm-cov: Make debug output more consistent
This changes the debug output of the llvm-cov tool to consistently
write to stderr, and moves the highlighting output closer to where
it's relevant.

llvm-svn: 217838
2014-09-15 22:23:29 +00:00
David Blaikie 8dee57a75d Fix indenting caused by clang-format+spuriously indented access specifier in r216925
Caught in post-commit review by Justin Bogner.

llvm-svn: 217837
2014-09-15 22:20:31 +00:00
Justin Bogner 0b3614f806 llvm-cov: Fix an issue with showing regions but not counts
In r217746, though it was supposed to be NFC, I broke llvm-cov's
handling of showing regions without showing counts. This should've
shown up in the existing tests, except they were checking debug output
that was displayed regardless of what was actually output. I've moved
the relevant debug output to a more appropriate place so that the
tests catch this kind of thing.

llvm-svn: 217835
2014-09-15 22:12:28 +00:00
Rafael Espindola 9dd2d5810f Add back tests for empty function in SPARC and PowerPC.
llvm-svn: 217834
2014-09-15 22:11:07 +00:00
Juergen Ributzka afa034fb61 [FastISel][AArch64] Add lowering support for frem.
This lowers frem to a runtime libcall inside fast-isel.

The test case also checks the CallLoweringInfo bug that was exposed by this
change.

This fixes rdar://problem/18342783.

llvm-svn: 217833
2014-09-15 22:07:49 +00:00
Juergen Ributzka 3c5f180255 [FastISel] Fix a bug in FastISel::CallLoweringInfo.
This fixes a bug in FastISel::CallLoweringInfo, where the number of
arguments was obtained from the argument vector before it had been
initialized.

Test case follows in another commit.

llvm-svn: 217832
2014-09-15 22:07:44 +00:00
Sanjay Patel d4f4c4e416 Replace repeated null checks with an assert. NFC.
Without a vector to hold the created ops, these 
functions don't have any use.

llvm-svn: 217831
2014-09-15 21:52:51 +00:00
Nick Kledzik d7679269a9 [Support] add decodeSLEB128()
We already have routines to encode SLEB128 as well as encode/decode ULEB128.
This last function fills out the matrix.  I'll need this for some llvm-objdump
work I am doing.

llvm-svn: 217830
2014-09-15 21:51:49 +00:00
Juergen Ributzka e1779e2a8b [FastISel][AArch64] Refactor selectAddSub, selectLogicalOp, and SelectShift. NFC.
Small refactor to tidy up the code a little.

llvm-svn: 217827
2014-09-15 21:27:56 +00:00
Juergen Ributzka 6127b1968d [FastISel][AArch64] Refactor code to use isTypeSupported. NFC.
Gets rid of isLoadStoreTypeLegal and replace it with isTypeSupported.

llvm-svn: 217826
2014-09-15 21:27:54 +00:00
Jingyue Wu b67140b812 Remove dead code in SimplifyCFG
Summary: UsedByBranch is always true according to how BonusInst is defined.

Test Plan:
Passes check-all, and also verified 

if (BonusInst && !UsedByBranch) {
  ...
}

is never entered during check-all.

Reviewers: resistor, nadav, jingyue

Reviewed By: jingyue

Subscribers: llvm-commits, eliben, meheff

Differential Revision: http://reviews.llvm.org/D5324

llvm-svn: 217824
2014-09-15 20:48:13 +00:00
Juergen Ributzka 8984f48d89 [FastISel][AArch64] Improve floating-point compare support.
Add support for the last two missing fcmp condition codes: UEQ and ONE.

This fixes rdar://problem/18341575.

llvm-svn: 217823
2014-09-15 20:47:16 +00:00
Juergen Ributzka d111d29f90 [FastISel] Move optimizeCmpPredicate to FastISel base class. NFC.
Make the optimizeCmpPredicate function available to all targets.

llvm-svn: 217822
2014-09-15 20:47:13 +00:00
Reed Kotler 32be74b178 Add mips32 r1 to the list of supported targets for Mips fast-isel
Summary:
Expand list of supported targets for Mips to include mips32 r1.
Previously it only include r2. More patches are coming where there is 
a difference but in the current patches as pushed upstream, r1 and r2
are equivalent.

Test Plan:
simplestorefp1.ll

add new build bots at mips to test this flavor at both -O0 and -O2

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D5306

llvm-svn: 217821
2014-09-15 20:30:25 +00:00
David Majnemer 15f7ed96ac Fix the build for MSVC, it doesn't support extended sizeof
llvm-svn: 217820
2014-09-15 20:28:38 +00:00
Chandler Carruth 204ad4c613 [x86] Start fixing our emission of ADDSUBPS and ADDSUBPD instructions by
introducing a synthetic X86 ISD node representing this generic
operation.

The relevant patterns for mapping these nodes into the concrete
instructions are also added, and a gnarly bit of C++ code in the
target-specific DAG combiner is replaced with simple code emitting this
primitive.

The next step is to generically combine blends of adds and subs into
this node so that we can drop the reliance on an SSE4.1 ISD node
(BLENDI) when matching an SSE3 feature (ADDSUB).

llvm-svn: 217819
2014-09-15 20:09:47 +00:00