Commit Graph

67743 Commits

Author SHA1 Message Date
Tim Northover 38a93aaaa1 AArch64: error when both positional & named operands are used.
Only one instruction pair needed changing: SMULH & UMULH. The previous
code worked, but MC was doing extra work treating Ra as a valid
operand (which then got completely overwritten in MCCodeEmitter).

No behaviour change, so no tests.

llvm-svn: 203772
2014-03-13 09:00:13 +00:00
Alexey Samsonov 96dc29c028 [C++11] DWARF parser: use SmallVector<std::unique_ptr> for parsed units in DWARFContext, and delete custom destructors
llvm-svn: 203770
2014-03-13 08:19:59 +00:00
Hal Finkel 27774d9274 [PowerPC] Initial support for the VSX instruction set
VSX is an ISA extension supported on the POWER7 and later cores that enhances
floating-point vector and scalar capabilities. Among other things, this adds
<2 x double> support and generally helps to reduce register pressure.

The interesting part of this ISA feature is the register configuration: there
are 64 new 128-bit vector registers, the 32 of which are super-registers of the
existing 32 scalar floating-point registers, and the second 32 of which overlap
with the 32 Altivec vector registers. This makes things like vector insertion
and extraction tricky: this can be free but only if we force a restriction to
the right register subclass when needed. A new "minipass" PPCVSXCopy takes care
of this (although it could do a more-optimal job of it; see the comment about
unnecessary copies below).

Please note that, currently, VSX is not enabled by default when targeting
anything because it is not yet ready for that.  The assembler and disassembler
are fully implemented and tested. However:

 - CodeGen support causes miscompiles; test-suite runtime failures:
      MultiSource/Benchmarks/FreeBench/distray/distray
      MultiSource/Benchmarks/McCat/08-main/main
      MultiSource/Benchmarks/Olden/voronoi/voronoi
      MultiSource/Benchmarks/mafft/pairlocalalign
      MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4
      SingleSource/Benchmarks/CoyoteBench/almabench
      SingleSource/Benchmarks/Misc/matmul_f64_4x4

 - The lowering currently falls back to using Altivec instructions far more
   than it should. Worse, there are some things that are scalarized through the
   stack that shouldn't be.

 - A lot of unnecessary copies make it past the optimizers, and this needs to
   be fixed.

 - Many more regression tests are needed.

Normally, I'd fix these things prior to committing, but there are some
students and other contributors who would like to work this, and so it makes
sense to move this development process upstream where it can be subject to the
regular code-review procedures.

llvm-svn: 203768
2014-03-13 07:58:58 +00:00
Hal Finkel 5457bd08cb [TableGen] Optionally forbid overlap between named and positional operands
There are currently two schemes for mapping instruction operands to
instruction-format variables for generating the instruction encoders and
decoders for the assembler and disassembler respectively: a) to map by name and
b) to map by position.

In the long run, we'd like to remove the position-based scheme and use only
name-based mapping. Unfortunately, the name-based scheme currently cannot deal
with complex operands (those with suboperands), and so we currently must use
the position-based scheme for those. On the other hand, the position-based
scheme cannot deal with (register) variables that are split into multiple
ranges. An upcoming commit to the PowerPC backend (adding VSX support) will
require this capability. While we could teach the position-based scheme to
handle that, since we'd like to move away from the position-based mapping
generally, it seems silly to teach it new tricks now. What makes more sense is
to allow for partial transitioning: use the name-based mapping when possible,
and only use the position-based scheme when necessary.

Now the problem is that mixing the two sensibly was not possible: the
position-based mapping would map based on position, but would not skip those
variables that were mapped by name. Instead, the two sets of assignments would
overlap. However, I cannot currently change the current behavior, because there
are some backends that rely on it [I think mistakenly, but I'll send a message
to llvmdev about that]. So I've added a new TableGen bit variable:
noNamedPositionallyEncodedOperands, that can be used to cause the
position-based mapping to skip variables mapped by name.

llvm-svn: 203767
2014-03-13 07:57:54 +00:00
Alexey Samsonov 1eabf98b32 [C++11] Convert DWARF parser to range-based for loops
llvm-svn: 203766
2014-03-13 07:52:54 +00:00
Saleem Abdulrasool aae4dc21ea ARM: ignore unused variable to fix -Wunused-variable builds
llvm-svn: 203765
2014-03-13 07:15:45 +00:00
Saleem Abdulrasool 324133910a MC: fix silly typo
llvm-svn: 203763
2014-03-13 07:02:46 +00:00
Saleem Abdulrasool dadf94ce84 ARM: support emission of complex SO expressions
Support to the IAS was added to actually parse and handle the complex SO
expressions.  However, the object file lowering was not updated to compensate
for the fact that the shift operand may be an absolute expression.

When trying to assemble to an object file, the lowering would fail while
succeeding when emitting purely assembly.  Add an appropriate test.

The test case is inspired by the test case provided by Jiangning Liu who also
brought the issue to light.

llvm-svn: 203762
2014-03-13 07:02:41 +00:00
Saleem Abdulrasool 9b7c0af292 Support: add support to identify WinCOFF/ARM objects
Add the Windows COFF ARM object file magic.  This enables the LLVM tools to
interact with COFF object files for Windows on ARM.

llvm-svn: 203761
2014-03-13 07:02:35 +00:00
Owen Anderson abb90c9ddb Phase 1 of refactoring the MachineRegisterInfo iterators to make them suitable
for use with C++11 range-based for-loops.

The gist of phase 1 is to remove the skipInstruction() and skipBundle()
methods from these iterators, instead splitting each iterator into a version
that walks operands, a version that walks instructions, and a version that
walks bundles.  This has the result of making some "clever" loops in lib/CodeGen
more verbose, but also makes their iterator invalidation characteristics much
more obvious to the casual reader. (Making them concise again in the future is a
good motivating case for a pre-incrementing range adapter!)

Phase 2 of this undertaking with consist of removing the getOperand() method,
and changing operator*() of the operand-walker to return a MachineOperand&.  At
that point, it should be possible to add range views for them that work as one
might expect.

llvm-svn: 203757
2014-03-13 06:02:25 +00:00
Saleem Abdulrasool ac58e9fc0b MC: fix possible NULL pointer dereference
Avoid NULL pointer scenario found via clang's static analyzer.

llvm-svn: 203745
2014-03-13 02:09:51 +00:00
Mark Seaborn 07e7486128 Fix typo in comment: "inwoke" -> "invoke"
llvm-svn: 203739
2014-03-13 00:04:17 +00:00
David Blaikie f4ad698336 MCDwarf: Remove unused parameter
llvm-svn: 203727
2014-03-12 22:35:23 +00:00
David Blaikie a55e64f84a MCDwarf: Invert the Section+CU->LineEntries mapping so the CU is the primary dimension
This makes the mapping consistent with other CU->X mappings in the
MCContext, helping pave the way to refactor all these values into a
single data structure per CU and thus a single map.

I haven't renamed the data structure as that would make the patch churn
even higher (the MCLineSection name no longer makes sense, as this
structure now contains lines for multiple sections covered by a single
CU, rather than lines for a single section in multiple CUs) and further
refactorings will follow that may remove this type entirely.

For convenience, I also gave the MCLineSection value semantics so we
didn't have to do the lazy construction, manual delete, etc.

(& for those playing at home, refactoring the line printing into a
single data structure will eventually alow that data structure to be
reused to own the debug_line.dwo line table used for type unit file name
resolution)

llvm-svn: 203726
2014-03-12 22:28:56 +00:00
Justin Bogner ec49f9820c Back out Profile library and dependent commits
Chandler voiced some concern with checking this in without some
discussion first. Reverting for now.

This reverts r203703, r203704, r203708, and 203709.

llvm-svn: 203723
2014-03-12 22:00:57 +00:00
Michael Zolotukhin 66806aef1e PR17473:
Don't normalize an expression during postinc transformation unless it's
invertible.

llvm-svn: 203719
2014-03-12 21:31:05 +00:00
Adam Nemet d4e56073c7 [X86] Add peephole for masked rotate amount
Extend what's currently done for shift because the HW performs this masking
implicitly:

   (rotl:i32 x, (and y, 31)) -> (rotl:i32 x, y)

I use the newly factored out multiclass that was only supporting shifts so
far.

For testing I extended my testcase for the new rotation idiom.

<rdar://problem/15295856>

llvm-svn: 203718
2014-03-12 21:20:55 +00:00
Michael Zolotukhin 15e6e543b9 Test commit
llvm-svn: 203716
2014-03-12 21:15:56 +00:00
Justin Bogner a1f278f96c Profile: Remove an inefficient and unnecessary API function
This was leftover from an approach I abandoned, but I forgot to update
it before committing.

llvm-svn: 203708
2014-03-12 20:26:37 +00:00
Raul E. Silvera 62f0236d36 Resubmit "[SLPV] Recognize vectorizable intrinsics during SLP vectorization ..."
This reverts commit 86cb795388643710dab34941ddcb5a9470ac39d8.
The problems previously found have been resolved through other CLs.

llvm-svn: 203707
2014-03-12 20:21:50 +00:00
Rafael Espindola f3336bc1d5 Reject alias to undefined symbols in the verifier.
On ELF and COFF an alias is just another name for a position in the file.
There is no way to refer to a position in another file, so an alias to
undefined is meaningless.

MachO currently doesn't support aliases. The spec has a N_INDR, which when
implemented will have a different set of restrictions. Adding support for
it shouldn't be harder than any other IR extension.

For now, having the IR represent what is actually possible with current
tools makes it easier to fix the design of GlobalAlias.

llvm-svn: 203705
2014-03-12 20:15:49 +00:00
Justin Bogner a2e0368994 Profile: Add a library for the instrumentation based profiling format
This provides a library to work with the instrumentation based
profiling format that is used by clang's -fprofile-instr-* options and
by the llvm-profdata tool. This is a binary format, rather than the
textual one that's currently in use.

The tests are in the subsequent commits that use this.

llvm-svn: 203703
2014-03-12 20:14:05 +00:00
Roman Divacky a26f9a6a42 Allow exclamation and tilde to be parsed as a part of the ppc asm operand.
llvm-svn: 203699
2014-03-12 19:25:57 +00:00
Matt Arsenault e389dd5d68 R600: Fix trunc store from i64 to i1
llvm-svn: 203695
2014-03-12 18:45:52 +00:00
Hans Wennborg b73c0b041d Allow switch-to-lookup table for tables with holes by adding bitmask check
This allows us to generate table lookups for code such as:

  unsigned test(unsigned x) {
    switch (x) {
      case 100: return 0;
      case 101: return 1;
      case 103: return 2;
      case 105: return 3;
      case 107: return 4;
      case 109: return 5;
      case 110: return 6;
      default: return f(x);
    }
  }

Since cases 102, 104, etc. are not constants, the lookup table has holes
in those positions. We therefore guard the table lookup with a bitmask check.

Patch by Jasper Neumann!

llvm-svn: 203694
2014-03-12 18:35:40 +00:00
Eric Christopher 8cc04fc40d When computing the size of a base type be conservative if the type
is a declaration and return the size of the type.

llvm-svn: 203690
2014-03-12 18:18:05 +00:00
Evan Cheng ad6efbfa0f Revert r203488 and r203520.
llvm-svn: 203687
2014-03-12 18:09:37 +00:00
Rafael Espindola 2e43aff460 Avoid repeated calls to CE->getOperand(0). No functionality change.
llvm-svn: 203686
2014-03-12 18:08:14 +00:00
Adam Nemet b667c3fc26 [X86] Refactor peepholes for masked shift amount into a multiclass
The peephole (shift x, (and y, 31)) -> (shift x, y) is repeated for each
integer type and each shift variant.

To improve this a new multiclass is added that covers all integer types.  The
shift patterns are now instantiated from this.  I am planning to add new
instances for rotates as well.

No functional change intended:

  * test/CodeGen/X86/shift-and.ll provides coverage

  * Compared the expanded tablegen output and matched up the defs for these
    Pat<>s before and after

llvm-svn: 203685
2014-03-12 18:02:33 +00:00
Quentin Colombet b5e41ea144 [X86] Set the scheduling resources of some of the FPStack instructions.
This is related to <rdar://problem/15607571>.

llvm-svn: 203682
2014-03-12 17:33:42 +00:00
Eric Christopher 1acdbb8856 Use values we've already computed, update comment.
No functional change.

llvm-svn: 203681
2014-03-12 17:14:46 +00:00
Eric Christopher 7924e0cca2 Turn on hashing by default for split dwarf compile units.
llvm-svn: 203680
2014-03-12 17:14:43 +00:00
David Blaikie c47d084650 Correct typo ("a entry" -> "an entry")
llvm-svn: 203678
2014-03-12 16:56:05 +00:00
Rafael Espindola 3d5d464df8 Try harder to evaluate expressions when printing assembly.
When printing assembly we don't have a Layout object, but we can still
try to fold some constants.

Testcase by Ulrich Weigand.

llvm-svn: 203677
2014-03-12 16:55:59 +00:00
David Blaikie 7066f7bc39 DebugInfo: Use common line/file attribute construction code
llvm-svn: 203676
2014-03-12 16:51:06 +00:00
Eli Bendersky 95b540f221 Revive SizeOptLevel-explaining comments that were dropped in r203669
llvm-svn: 203675
2014-03-12 16:44:17 +00:00
Hans Wennborg 6693c673a1 Add comment pointing to the binutils bugzilla entry
This is a follow-up to r203635 as suggested by Rafael.

llvm-svn: 203670
2014-03-12 16:14:23 +00:00
Eli Bendersky 49f6565267 Move duplicated code into a helper function (exposed through overload).
There's a bit of duplicated "magic" code in opt.cpp and Clang's CodeGen that
computes the inliner threshold from opt level and size opt level.

This patch moves the code to a function that lives alongside the inliner itself,
providing a convenient overload to the inliner creation.

A separate patch can be committed to Clang to use this once it's committed to
LLVM. Standalone tools that use the inlining pass can also avoid duplicating
this code and fearing it will go out of sync.

Note: this patch also restructures the conditinal logic of the computation to
be cleaner.

llvm-svn: 203669
2014-03-12 16:12:36 +00:00
Will Schmidt acae468c8e Update the datalayout string for ppc64LE.
Update the datalayout string for ppc64LE.

llvm-svn: 203664
2014-03-12 14:59:17 +00:00
Alon Mishne 07d949f39a Cloning a function now also clones its debug metadata if 'ModuleLevelChanges' is true.
llvm-svn: 203662
2014-03-12 14:42:51 +00:00
Daniel Sanders 61c76cc56f [mips][fp64] Add an implicit def to MTHC1 claiming that it reads the lower 32-bits of 64-bit FPR
Summary:
This is a white lie to workaround a widespread bug in the -mfp64
implementation.

The problem is that none of the 32-bit fpu ops mention the fact that they
clobber the upper 32-bits of the 64-bit FPR. This allows MTHC1 to be
scheduled on the wrong side of most 32-bit FPU ops, particularly MTC1.
Fixing that requires a major overhaul of the FPU implementation which can't
be done right now due to time constraints.

The testcase is SingleSource/Benchmarks/Misc/oourafft.c when given
TARGET_CFLAGS='-mips32r2 mfp64 -mmsa'.

Also correct the comment added in r203464 to indicate that two
instructions were affected.

Reviewers: matheusalmeida, jacksprat

Reviewed By: matheusalmeida

Differential Revision: http://llvm-reviews.chandlerc.com/D3029

llvm-svn: 203659
2014-03-12 13:35:43 +00:00
Daniel Sanders df22154579 [mips] BSEL's and BINS[RL] operands are reversed compared to the vselect node used in the pattern.
Summary:
Correct the match patterns and the lowerings that made the CodeGen tests pass despite the mistakes.

The original testcase that discovered the problem was SingleSource/UnitTests/SignlessType/factor.c in test-suite.
During review, we also found that some of the existing CodeGen tests were incorrect and fixed them:
* bitwise.ll: In bsel_v16i8 the IfSet/IfClear were reversed because bsel and bmnz have different operand orders and the test didn't correctly account for this. bmnz goes 'IfClear, IfSet, CondMask', while bsel goes 'CondMask, IfClear, IfSet'.
* vec.ll: In the cases where a bsel is emitted as a bmnz (they are the same operation with a different input tied to the result) the operands were in the wrong order.
* compare.ll and compare_float.ll: The bsel operand order was correct for a greater-than comparison, but a greater-than comparison instruction doesn't exist. Lowering this operation inverts the condition so the IfSet/IfClear need to be swapped to match.

The differences between BSEL, BMNZ, and BMZ and how they map to/from vselect are rather confusing. I've therefore added a note to MSA.txt to explain this in a single place in addition to the comments that explain each case.

Reviewers: matheusalmeida, jacksprat

Reviewed By: matheusalmeida

Differential Revision: http://llvm-reviews.chandlerc.com/D3028

llvm-svn: 203657
2014-03-12 11:54:00 +00:00
Tim Northover 3cccc45a9f ARM: correct Dwarf output for non-contiguous VFP saves.
When the list of VFP registers to be saved was non-contiguous (so multiple
vpush/vpop instructions were needed) these were being ordered oddly, as in:
    vpush {d8, d9}
    vpush {d11}

This led to the layout in memory being [d11, d8, d9] which is ugly and doesn't
match the CFI_INSTRUCTIONs we're generating either (so Dwarf info would be
broken).

This switches the order of vpush/vpop (in both prologue and epilogue,
obviously) so that the Dwarf locations are correct again.

rdar://problem/16264856

llvm-svn: 203655
2014-03-12 11:29:23 +00:00
Patrik Hagglund 1da3512166 Replace '#include ValueTypes.h' with forward declarations.
In some cases the include is pushed "downstream" (or removed if
unused).

llvm-svn: 203644
2014-03-12 08:00:24 +00:00
Hans Wennborg 14863418ed [ARM] Use DWARF register numbers for CFI directives in ELF assembly
It seems gas can't handle CFI directives with VFP register names ("d12", etc.).
This broke us trying to build Chromium for Android after 201423.

A gas bug has been filed: https://sourceware.org/bugzilla/show_bug.cgi?id=16694

compnerd suggested making this conditional on whether we're using the integrated
assembler or not. I'll look into that in a follow-up patch.

Differential Revision: http://llvm-reviews.chandlerc.com/D3049

llvm-svn: 203635
2014-03-12 03:52:34 +00:00
David Blaikie adbea1ef9f DebugInfo: Omit pubnames/pubtypes when compiling with -gmlt
llvm-svn: 203634
2014-03-12 03:34:38 +00:00
David Blaikie ce2f1cb918 DebugInfo: Do not emit pubnames/pubtypes sections if they are empty
llvm-svn: 203622
2014-03-11 23:35:06 +00:00
David Blaikie 55bb8ac74b DebugInfo: Avoid re-looking up the DwarfUnit when emitting pubnames/pubtypes
llvm-svn: 203620
2014-03-11 23:23:39 +00:00
David Blaikie 0f55e833a6 DebugInfo: Refactor emitDebugPubNames/Types into a common implementation
I could fold the callers into their one call site, but the indirection
(given how verbose choosing the section is) seemed helpful.

The use of a member function pointer's a bit "tricky", but seems limited
enough, the call sites are simple/clean/clear, and there's only one use.

llvm-svn: 203619
2014-03-11 23:18:15 +00:00
David Blaikie 2cd3c1bc3d Accept Twine's to AsmPrinter::getTempSymbol (refactoring for an incoming change)
llvm-svn: 203617
2014-03-11 23:12:08 +00:00
David Blaikie ee89a064bb DebugInfo: Remove unused labels now that we just emit DW_AT_gnu_pubnames as a flag (as of r203082)
llvm-svn: 203612
2014-03-11 22:24:33 +00:00
Saleem Abdulrasool afc50b3ed4 support: add a utility function to normalise path separators
Add a utility function to convert the Windows path separator to Unix style path
separators.  This is used by a subsequent change in clang to enable the use of
Windows SDK headers on Linux.

llvm-svn: 203611
2014-03-11 22:05:42 +00:00
Sasa Stankovic 8600ebc74d [mips] Implement NaCl sandboxing of function calls:
* Add masking instructions before indirect calls (in MC layer).
  * Align call + branch delay to the bundle end (in MC layer).

Differential Revision: http://llvm-reviews.chandlerc.com/D3032

llvm-svn: 203606
2014-03-11 21:23:40 +00:00
Rafael Espindola a063bdde8d Simplify a really complicated check for Arch == X86_64.
The function hasReliableSymbolDifference had exactly one use in the MachO
writer. It is also only true for X86_64. In fact, the comments refers to
"Darwin x86_64" and everything else, so this makes the code match the
comment.

If this is to be abstracted again, it should be a property of
TargetObjectWriter, like useAggressiveSymbolFolding.

llvm-svn: 203605
2014-03-11 21:22:57 +00:00
Rafael Espindola 83f858e578 Cleanup the interface for creating soft or hard links.
Before this patch the unix code for creating hardlinks was unused. The code
for creating symbolic links was implemented in lib/Support/LockFileManager.cpp
and the code for creating hard links in lib/Support/*/Path.inc.

The only use we have for these is in LockFileManager.cpp and it can use both
soft and hard links. Just have a create_link function that creates one or the
other depending on the platform.

llvm-svn: 203596
2014-03-11 18:40:24 +00:00
Owen Anderson 56112b999b Range-ify a loop.
llvm-svn: 203590
2014-03-11 17:37:48 +00:00
Hans Wennborg 6c37f8b985 X86: Don't generate 64-bit movd after cmpneqsd in 32-bit mode (PR19059)
This fixes the bug where we would bitcast the 64-bit floating point result
of cmpneqsd to a 64-bit integer even on 32-bit targets.

Differential Revision: http://llvm-reviews.chandlerc.com/D3009

llvm-svn: 203581
2014-03-11 15:49:24 +00:00
Ulrich Weigand fa84ac9d8a [ppc64] Patch in TOC restore code after all external function calls
When resolving a function call to an external routine, the dynamic
loader must patch the "nop" after the branch instruction to a load
that restores the TOC register.

Current code does that, but only with the *first* instance of a call
to any particular external routine, i.e. at the point where it also
allocates the call stub.  With subsequent calls to the same routine,
current code neglects to patch in the TOC restore code.  This is a
bug, and leads to corrupt TOC pointers in those cases.

Fixed by patching in restore code every time.

llvm-svn: 203580
2014-03-11 15:26:27 +00:00
Saleem Abdulrasool 0d96f3dd6e ARM: honour -f{no-,}optimize-sibling-calls
Use the options in the ARMISelLowering to control whether tail calls are
optimised or not.  Previously, this option was entirely ignored on the ARM
target and only honoured on x86.

This option is mostly useful in profiling scenarios.  The default remains that
tail call optimisations will be applied.

llvm-svn: 203577
2014-03-11 15:09:54 +00:00
Saleem Abdulrasool b720a6bab7 ARM: remove ancient -arm-tail-calls option
This option is from 2010, designed to work around a linker issue on Darwin for
ARM.  According to grosbach this is no longer an issue and this option can
safely be removed.

llvm-svn: 203576
2014-03-11 15:09:49 +00:00
Saleem Abdulrasool ec1ec1b416 ARM: enable tail call optimisation on Thumb 2
Tail call optimisation was previously disabled on all targets other than
iOS5.0+.  This enables the tail call optimisation on all Thumb 2 capable
platforms.

The test adjustments are to remove the IR hint "tail" to function invocation.
The tests were designed assuming that tail call optimisations would not kick in
which no longer holds true.

llvm-svn: 203575
2014-03-11 15:09:44 +00:00
Erik Verbruggen 3f5dcc97e0 Fix crash in PRE.
After r203553 overflow intrinsics and their non-intrinsic (normal)
instruction get hashed to the same value. This patch prevents PRE from
moving an instruction into a predecessor block, and trying to add a phi
node that gets two different types (the intrinsic result and the
non-intrinsic result), resulting in a failing assert.

llvm-svn: 203574
2014-03-11 15:07:32 +00:00
Tim Northover 445dd58aae ARM: simplify EmitAtomicBinary64
ATOMIC_STORE operations always get here as a lowered ATOMIC_SWAP, so there's no
need for any code to handle them specially.

There should be no functionality change so no tests.

llvm-svn: 203567
2014-03-11 13:19:55 +00:00
Benjamin Kramer f8502272ef Remove copy ctors that did the same thing as the default one.
The code added nothing but potentially disabled move semantics and made
types non-trivially copyable.

llvm-svn: 203563
2014-03-11 11:32:49 +00:00
Tim Northover e94a518a22 IR: add a second ordering operand to cmpxhg for failure
The syntax for "cmpxchg" should now look something like:

	cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic

where the second ordering argument gives the required semantics in the case
that no exchange takes place. It should be no stronger than the first ordering
constraint and cannot be either "release" or "acq_rel" (since no store will
have taken place).

rdar://problem/15996804

llvm-svn: 203559
2014-03-11 10:48:52 +00:00
Erik Verbruggen aab3cfe023 GVN: fix hashing of extractvalue.
My last commit did not add the indexes to the hashed value for
extractvalue. Adding that back in.

llvm-svn: 203558
2014-03-11 10:21:30 +00:00
Erik Verbruggen e2d437148a GVN: merge overflow intrinsics with non-overflow instructions.
When an overflow intrinsic is followed by a non-overflow instruction,
replace the latter with an extract. For example:

  %sadd = tail call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
  %sadd3 = add i32 %a, %b

Here the add statement will be replaced by an extract.

When an overflow intrinsic follows a non-overflow instruction, a clone
of the intrinsic is inserted before the normal instruction, which makes
it the same as the previous case. Subsequent runs of GVN can then clean
up the duplicate instructions and insert the extract.

This fixes PR8817.

llvm-svn: 203553
2014-03-11 09:36:48 +00:00
Saleem Abdulrasool 5e1780e228 Object: rename ARMV7 to ARMNT
The official specifications state the name to be ARMNT (as per the Microsoft
Portable Executable and Common Object Format Specification v8.3).

llvm-svn: 203530
2014-03-11 03:08:37 +00:00
Duncan P. N. Exon Smith cec1c2486a Cleanup whitespace
llvm-svn: 203529
2014-03-11 02:44:45 +00:00
Matt Arsenault 0211714ecb R600: Calculate store mask instead of using switch.
llvm-svn: 203527
2014-03-11 01:38:53 +00:00
Jim Grosbach c94d993adf X86: Enable ISel of 16-bit MOVBE instructions.
When the MOVBE instructions are available, use them for 16-bit endian
swapping as well as for 32 and 64 bit.

The patterns were already present on the instructions, but weren't being
matched because the operation was unconditionally marked to 'Expand.'
Change that to be conditional on whether the MOVBE instructions are
available. Use 'rolw' to implement the in-register version (32 and 64
bit have the dedicated 'bswap' instruction for that).

Patch by Louis Gerbarg <lgg@apple.com>.

rdar://15479984

llvm-svn: 203524
2014-03-11 00:44:14 +00:00
Evan Cheng bf371db951 Follow up to r203488. Code clean up to eliminate a lot of copy+paste.
llvm-svn: 203520
2014-03-11 00:24:20 +00:00
Matt Arsenault faa297e89e Remove incomplete comment
llvm-svn: 203518
2014-03-11 00:01:37 +00:00
Matt Arsenault 6dde30354a Move trivial getter into header.
llvm-svn: 203517
2014-03-11 00:01:34 +00:00
Matt Arsenault 9504d2f269 Use .data() instead of &x[0]
llvm-svn: 203516
2014-03-11 00:01:31 +00:00
Matt Arsenault e1f1da30f4 Fix indentation
llvm-svn: 203515
2014-03-11 00:01:27 +00:00
Matt Arsenault 95b714c749 Fix non 2-space indentation.
llvm-svn: 203514
2014-03-11 00:01:25 +00:00
Duncan P. N. Exon Smith 56cc990480 Module: Don't rename in getOrInsertFunction()
During LTO, user-supplied definitions of C library functions often
exist.  -instcombine uses Module::getOrInsertFunction() to get a handle
on library functions (e.g., @puts, when optimizing @printf).

Previously, Module::getOrInsertFunction() would rename any matching
functions with local linkage, and create a new declaration.  In LTO,
this is the opposite of desired behaviour, as it skips by the
user-supplied version of the library function and creates a new
undefined reference which the linker often cannot resolve.

After some discussing with Rafael on the list, it looks like it's
undesired behaviour.  If a consumer actually *needs* this behaviour, we
should add new API with a more explicit name.

I added two testcases: one specifically for the -instcombine behaviour
and one for the LTO flow.

<rdar://problem/16165191>

llvm-svn: 203513
2014-03-10 23:42:28 +00:00
Raul E. Silvera ce376c0fcb When analyzing vectors of element type that require legalization,
the legalization cost must be included to get an accurate
estimation of the total cost of the scalarized vector.
The inaccurate cost triggered unprofitable SLP vectorization on
32-bit X86.

Summary:
Include legalization overhead when computing scalarization cost

Reviewers: hfinkel, nadav

CC: chandlerc, rnk, llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2992

llvm-svn: 203509
2014-03-10 22:59:13 +00:00
Diego Novillo 92aa8c220a Use discriminator information in sample profiles.
Summary:
When the sample profiles include discriminator information,
use the discriminator values to distinguish instruction weights
in different basic blocks.

This modifies the BodySamples mapping to map <line, discriminator> pairs
to weights. Instructions on the same line but different blocks, will
use different discriminator values. This, in turn, means that the blocks
may have different weights.

Other changes in this patch:

- Add tests for positive values of line offset, discriminator and samples.
- Change data types from uint32_t to unsigned and int and do additional
  validation.

Reviewers: chandlerc

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2857

llvm-svn: 203508
2014-03-10 22:41:28 +00:00
Justin Bogner 28e1cf6061 IR: Slightly more verbose error in Verifier
Extend the error message generated by the Verifier when an intrinsic
name does not match the expected mangling to include the expected
name.  Simplifies debugging.

Patch by Philip Reames!

llvm-svn: 203490
2014-03-10 21:22:44 +00:00
Benjamin Kramer 3ef5e46b6d MemCpyOpt: When merging memsets also merge the trivial case of two memsets with the same destination.
The testcase is from PR19092, but I think the bug described there is actually a clang issue.

llvm-svn: 203489
2014-03-10 21:05:13 +00:00
Evan Cheng 0e8f4612a9 For functions with ARM target specific calling convention, when simplify-libcall
optimize a call to a llvm intrinsic to something that invovles a call to a C
library call, make sure it sets the right calling convention on the call.

e.g.
extern double pow(double, double);
double t(double x) {
  return pow(10, x);
}

Compiles to something like this for AAPCS-VFP:
define arm_aapcs_vfpcc double @t(double %x) #0 {
entry:
  %0 = call double @llvm.pow.f64(double 1.000000e+01, double %x)
  ret double %0
}

declare double @llvm.pow.f64(double, double) #1

Simplify libcall (part of instcombine) will turn the above into:
define arm_aapcs_vfpcc double @t(double %x) #0 {
entry:
  %__exp10 = call double @__exp10(double %x) #1
  ret double %__exp10
}

declare double @__exp10(double)

The pre-instcombine code works because calls to LLVM builtins are special.
Instruction selection will chose the right calling convention for the call.
However, the code after instcombine is wrong. The call to __exp10 will use
the C calling convention.

I can think of 3 options to fix this.

1. Make "C" calling convention just work since the target should know what CC
   is being used.

   This doesn't work because each function can use different CC with the "pcs"
   attribute.

2. Have Clang add the right CC keyword on the calls to LLVM builtin.

   This will work but it doesn't match the LLVM IR specification which states
   these are "Standard C Library Intrinsics".

3. Fix simplify libcall so the resulting calls to the C routines will have the
   proper CC keyword. e.g.
   %__exp10 = call arm_aapcs_vfpcc double @__exp10(double %x) #1

   This works and is the solution I implemented here.

Both solutions #2 and #3 would work. After carefully considering the pros and
cons, I decided to implement #3 for the following reasons.

1. It doesn't change the "spec" of the intrinsics.
2. It's a self-contained fix.

There are a couple of potential downsides.
1. There could be other places in the optimizer that is broken in the same way
   that's not addressed by this.
2. There could be other calling conventions that need to be propagated by
   simplify-libcall that's not handled.

But for now, this is the fix that I'm most comfortable with.

llvm-svn: 203488
2014-03-10 20:49:45 +00:00
Sasa Stankovic 5fddf61089 [mips] Implement NaCl sandboxing of loads, stores and SP changes:
* Add masking instructions before loads and stores (in MC layer).
  * Add masking instructions after SP changes (in MC layer).
  * Forbid loads, stores and SP changes in delay slots (in MI layer).

Differential Revision: http://llvm-reviews.chandlerc.com/D2904

llvm-svn: 203484
2014-03-10 20:34:23 +00:00
Eli Bendersky e78ae059b5 Make sure NVPTX doesn't emit symbol names that aren't valid in PTX.
NVPTX, like the other backends, relies on generic symbol name sanitizing done by
MCSymbol. However, the ptxas assembler is more stringent and disallows some
additional characters in symbol names.

See PR19099 for more details.

llvm-svn: 203483
2014-03-10 20:05:42 +00:00
Tim Northover ad96d012c3 llvm-c: expose unnamedaddr field of globals
Patch by Manuel Jacob.

llvm-svn: 203482
2014-03-10 19:24:35 +00:00
Reed Kotler 96b7402bac Fix regression with -O0 for mips .
llvm-svn: 203469
2014-03-10 16:31:25 +00:00
Benjamin Kramer 3ad5c96268 [C++11] Modernize the IR library a bit.
No functionality change.

llvm-svn: 203465
2014-03-10 15:03:06 +00:00
Daniel Sanders 059e4b158c [mips][fp64] Add an implicit def to MFHC1 claiming that it reads the lower 32-bits of 64-bit FPR
Summary:
This is a white lie to workaround a widespread bug in the -mfp64
implementation.

The problem is that none of the 32-bit fpu ops mention the fact that they
clobber the upper 32-bits of the 64-bit FPR. This allows MFHC1 to be
scheduled on the wrong side of most 32-bit FPU ops. Fixing that requires a
major overhaul of the FPU implementation which can't be done right now due to
time constraints.

MFHC1 is one of two affected instructions. These instructions are the only
FPU instructions that don't read or write the lower 32-bits. We therefore
pretend that it reads the bottom 32-bits to artificially create a dependency and
prevent the scheduler changing the behaviour of the code.
The other instruction is MTHC1 which will be fixed once I've have found a failing
test case for it. 

The testcase is test-suite/SingleSource/UnitTests/Vector/simple.c when
given TARGET_CFLAGS="-mips32r2 -mfp64 -mmsa".

Reviewers: jacksprat, matheusalmeida

Reviewed By: jacksprat

Differential Revision: http://llvm-reviews.chandlerc.com/D2966

llvm-svn: 203464
2014-03-10 15:01:57 +00:00
Matheus Almeida 64459d296b [mips] Assembly parser must invoke the target streamer to handle .set reorder macro.
llvm-svn: 203459
2014-03-10 13:21:10 +00:00
Tim Northover 2a661f3f73 AArch64: fix LowerCONCAT_VECTORS for new CodeGen.
The function was making too many assumptions about its input:

1. The NEON_VDUP optimisation was far too aggressive, assuming (I
think) that the input would always be BUILD_VECTOR.

2. We were treating most unknown concats as legal (by returning Op
rather than SDValue()). I think only concats of pairs of vectors are
actually legal.

http://llvm.org/PR19094

llvm-svn: 203450
2014-03-10 09:34:07 +00:00
Craig Topper 24e685fdb0 [C++11] Remove 'virtual' keyword from methods marked with 'override' keyword.
llvm-svn: 203444
2014-03-10 05:29:18 +00:00
Craig Topper 6ff5aa7c87 [C++11] Remove 'virtual' keyword from methods marked with 'override' keyword.
llvm-svn: 203442
2014-03-10 03:53:12 +00:00
Chandler Carruth e42bafece1 [AArch64] Fix a use of uninitialized memory introduced in r203125,
and caught by the MSan bootstrap build bot. This should hopefully get
the bot green at long last.

llvm-svn: 203441
2014-03-10 03:52:47 +00:00
Craig Topper d25ff6f917 De-virtualize a method since it doesn't override anything and isn't overridden itself.
llvm-svn: 203440
2014-03-10 03:22:59 +00:00
Craig Topper ca7e3e5c4b [C++11] Add 'override' keyword to virtual methods that override their base class.
llvm-svn: 203439
2014-03-10 03:19:03 +00:00
Chandler Carruth aee3ca6cfd [TTI] There is actually no realistic way to pop TTI implementations off
the stack of the analysis group because they are all immutable passes.
This is made clear by Craig's recent work to use override
systematically -- we weren't overriding anything for 'finalizePass'
because there is no such thing.

This is kind of a lame restriction on the API -- we can no longer push
and pop things, we just set up the stack and run. However, I'm not
invested in building some better solution on top of the existing
(terrifying) immutable pass and legacy pass manager.

llvm-svn: 203437
2014-03-10 02:45:14 +00:00
Chandler Carruth e9b50617b8 [LCG] Ran clang-format over this too and it pointed out some fixes.
llvm-svn: 203435
2014-03-10 02:14:14 +00:00
Chandler Carruth 5ae74a6af2 [PM] While I'm here, fix a few other clang-format issues. Pulls some
lines under 80-columns, etc.

llvm-svn: 203434
2014-03-10 02:12:14 +00:00
Craig Topper 6bc27bf359 [C++11] Add 'override' keyword to virtual methods that override their base class.
llvm-svn: 203433
2014-03-10 02:09:33 +00:00