111755 Commits

Author SHA1 Message Date
Ulrich Weigand 6b577e26f0 Use the integrated assembler as default on PowerPC
This was already done in clang; this commit now uses the integrated
assembler as default when using LLVM tools directly.

A number of test cases using inline asm had to be adapted, either by
updating the expected output, or by using -no-integrated-as (for such
tests that deliberately use an invalid instruction in inline asm).

llvm-svn: 225819
2015-01-13 19:43:45 +00:00
Chris Bieneman 5d23224f21 Running clang-format on CommandLine.h and CommandLine.cpp.
No functional changes, I'm just going to be doing a lot of work in these files and it would be helpful if they had more current LLVM style.

llvm-svn: 225817
2015-01-13 19:14:20 +00:00
Peter Collingbourne 943d270c81 Add link to Go bindings documentation.
llvm-svn: 225815
2015-01-13 18:49:42 +00:00
Hal Finkel 63fb928109 Revert "r225808 - [PowerPC] Add StackMap/PatchPoint support"
Reverting this while I investigate buildbot failures (segfaulting in
GetCostForDef at ScheduleDAGRRList.cpp:314).

llvm-svn: 225811
2015-01-13 18:25:05 +00:00
Will Schmidt 4a2d333982 Update multiline.ll testcase to handle (ppc64le) .localentry directive
The ppc64le platform will emit a .localentry directive. This is triggering
a false-positive against a CHECK-NOT: .loc in multiline.ll.
Add a space "{{ }}" to the check-not line to allow for arguments, and
prevent .localentry from matching.
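
The false positive comes down to substring matching. A minimal sketch, assuming a simplified model in which a CHECK-NOT pattern fires whenever its text occurs anywhere in an output line. The helper `occursIn` and the sample directive text are illustrative only and are not FileCheck's implementation:

```cpp
#include <string>

// Simplified stand-in for a CHECK-NOT test: returns true when `pattern`
// occurs anywhere in `line`. Real FileCheck does more (regex blocks,
// whitespace canonicalization); this only models the substring behaviour.
bool occursIn(const std::string &line, const std::string &pattern) {
  return line.find(pattern) != std::string::npos;
}
```

Under this model, the pattern `.loc` matches a line containing `.localentry`, while `.loc` followed by a space does not, yet still matches a genuine `.loc 1 5 0` directive.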


Differential Revision: http://reviews.llvm.org/D6935

llvm-svn: 225810
2015-01-13 18:17:08 +00:00
Hal Finkel 76a31f8c12 [PowerPC] Add missing override keyword
llvm-svn: 225809
2015-01-13 18:02:22 +00:00
Hal Finkel 821befd52b [PowerPC] Add StackMap/PatchPoint support
This commit does two things:

 1. Refactors PPCFastISel to use more of the common infrastructure for call
    lowering (this lets us take advantage of this common code for lowering some
    common intrinsics, stackmap/patchpoint among them).

 2. Adds support for stackmap/patchpoint lowering. For the most part, this is
    very similar to the support in the AArch64 target, with the obvious differences
    (different registers, NOP instructions, etc.). The test cases are adapted
    from the AArch64 test cases.

One difference of note is that the patchpoint call sequence takes 24 bytes, so
you can't use less than that (on AArch64 you can go down to 16). Also, as noted
in the docs, we take the patchpoint address to be the actual code address
(assuming the call is local in the TOC-sharing sense), which should yield
higher performance than generating the full cross-DSO indirect-call sequence
and is likely just as useful for JITed code (if not, we'll change it).

StackMaps and Patchpoints are still marked as experimental, and so this support
is doubly experimental. So go ahead and experiment!

llvm-svn: 225808
2015-01-13 17:48:12 +00:00
Hal Finkel c4ee2c5188 [StackMaps] Use CurrentFnSymForSize
When computing the call-site offset, use AP.CurrentFnSymForSize instead of
AP.CurrentFnSym. There should be no change for other targets, but this is
necessary for generating valid expressions for PPC64/ELF.

llvm-svn: 225807
2015-01-13 17:48:07 +00:00
Hal Finkel 0ad96c818c [StackMaps] Mark in CallLoweringInfo when lowering a patchpoint
While, generally speaking, the process of lowering arguments for a patchpoint
is the same as lowering a regular indirect call, on some targets it may not be
exactly the same. Targets may not, for example, want to add additional register
dependencies that apply only to making cross-DSO calls through linker stubs,
may not want to load additional registers out of function descriptors, and may
not want to add additional side-effect-causing instructions that cannot be
removed later with the call itself being generated.

The PowerPC target will use this in a future commit (for all of the reasons
stated above).

llvm-svn: 225806
2015-01-13 17:48:04 +00:00
Hal Finkel df87f9383b [StackMaps] Allow the target to pre-process the live-out mask
Some targets, PowerPC for example, have pseudo-registers (such as that used to
represent the rounding mode), that don't have DWARF register numbers or a
register class. These are used only for internal dependency tracking, and
should not appear in the recorded live-outs. This adds a callback allowing the
target to pre-process the live-out mask in order to remove these kinds of
registers so that the StackMaps code does not complain about them and/or
attempt to include them in the output.

This will be used by the PowerPC target in a future commit.

llvm-svn: 225805
2015-01-13 17:47:59 +00:00
Hal Finkel f4a22c0d48 [PowerPC] Split the blr definition into BLR and BLR8
We really need a separate 64-bit version of this instruction so that it can be
marked as clobbering LR8 (instead of just LR). No change in functionality
(although the verifier might be slightly happier), however, it is required for
stackmap/patchpoint support. Thus, this will be covered by stackmap test cases
once those are added.

llvm-svn: 225804
2015-01-13 17:47:54 +00:00
Hal Finkel 7d3d50bcb2 [PowerPC] Add DWARF numbers for CA (XER), etc.
For registers that have DWARF numbers (like CA, which is really part of XER),
add them. Also, RM is not an SPR, and the declaration hack (where it is
declared as an SPR with an arbitrary number) is not needed, so just declare it
as a register.

NFC; although CA's register number will be needed when stackmap/patchpoint
support is added.

llvm-svn: 225800
2015-01-13 17:45:11 +00:00
Jozef Kolek e7cad7a1df [mips][microMIPS] Fix issue with 16b instructions in jr instruction delay slot
16-bit instructions are not allowed in the jr delay slot. The same holds for
PseudoIndirectBranch and PseudoReturn.

Differential Revision: http://reviews.llvm.org/D6815

llvm-svn: 225798
2015-01-13 15:59:17 +00:00
Daniel Sanders bd1d69add6 Added a Mips lld milestone to the release notes for the 3.6 release.
llvm-svn: 225797
2015-01-13 15:17:00 +00:00
Olivier Sallenave 325096980b Added TLI hook for isFPExtFree. Some of the FMA combine heuristics are now guarded with that hook.
llvm-svn: 225795
2015-01-13 15:06:36 +00:00
Erik Eckstein a168ef753f Revert "SLPVectorizer: Cache results from memory alias checking."
The alias cache has a problem of incorrect collisions in case a new instruction is allocated at the same address as a previously deleted instruction.
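
The failure mode can be reproduced in miniature. This is a sketch under stated assumptions, not the SLPVectorizer code: `ToyInst`, `mayAlias`, and `AddressKeyedAliasCache` are hypothetical names, and placement new into a fixed buffer is used only to make the address reuse deterministic.

```cpp
#include <cstdint>
#include <map>
#include <new>
#include <utility>

// Toy instruction; `touchesMemory` stands in for whatever a real alias
// query would inspect.
struct ToyInst {
  bool touchesMemory;
};

// Hypothetical "expensive" query whose result the cache memoizes.
bool mayAlias(const ToyInst *a, const ToyInst *b) {
  return a->touchesMemory && b->touchesMemory;
}

// Cache keyed purely on the two instruction addresses.
class AddressKeyedAliasCache {
  using Key = std::pair<std::uintptr_t, std::uintptr_t>;
  std::map<Key, bool> results;

public:
  bool query(const ToyInst *a, const ToyInst *b) {
    Key key{reinterpret_cast<std::uintptr_t>(a),
            reinterpret_cast<std::uintptr_t>(b)};
    auto it = results.find(key);
    if (it != results.end())
      return it->second; // Hit: assumes the addresses still name the same objects.
    bool answer = mayAlias(a, b);
    results.emplace(key, answer);
    return answer;
  }
};

// Returns true when the cache hands back a stale answer after an address
// has been reused by a different instruction.
bool staleAnswerAfterAddressReuse() {
  alignas(ToyInst) unsigned char slot[sizeof(ToyInst)];
  ToyInst other{true};
  AddressKeyedAliasCache cache;

  ToyInst *first = new (slot) ToyInst{true};
  cache.query(first, &other); // Caches "true" for (slot, &other).
  first->~ToyInst();          // The instruction is "deleted".

  ToyInst *second = new (slot) ToyInst{false}; // New instruction, same address.
  bool cached = cache.query(second, &other);   // Stale hit from the old entry.
  bool fresh = mayAlias(second, &other);       // What an uncached query says.
  second->~ToyInst();
  return cached != fresh;
}
```

A cache like this stays correct only if entries are dropped whenever an instruction is deleted, or if the key includes something that survives address reuse.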

llvm-svn: 225790
2015-01-13 14:36:46 +00:00
Aaron Ballman 8bd9897730 Silence warnings about unknown pragmas for compilers that are not Clang. NFC.
llvm-svn: 225788
2015-01-13 14:30:07 +00:00
Peter Zotov 343991dd8b [OCaml] Allow out-of-tree builds of LLVM bindings.
In order to use this feature, configure LLVM as usual,
but then build and install it as:

   make all install SYSTEM_LLVM_CONFIG=llvm-config

where llvm-config is the llvm-config binary installed on your
system (possibly llvm-config-VERSION on e.g. Debian).

llvm-svn: 225787
2015-01-13 12:17:56 +00:00
Erik Eckstein 4a445c047f SLPVectorizer: Cache results from memory alias checking.
This speeds up the dependency calculations for blocks with many load/store/call instructions.
Besides the improved runtime, there is no functional change.

llvm-svn: 225786
2015-01-13 11:37:51 +00:00
Chandler Carruth 0550e11fd0 [PM] In the PassManager template, remove a pointless indirection through
a nested class template for the PassModel, and use the T-suffix for the
two typedefs to match the code in the AnalysisManager.

This is the last of the fairly fundamental code cleanups here. Will be
focusing on the printing of analyses next to finish that aspect off.

llvm-svn: 225785
2015-01-13 11:36:43 +00:00
Chandler Carruth 3e498400da [PM] Remove the 'AnalysisManagerT' type parameter from numerous layers
of templates in the new pass manager.

The analysis manager is now itself just a template predicated on the IR
unit. This makes lots of the templates really trivial and more clear:
they are all parameterized on a single type, the IR unit's type.
Everything else is a function of that. To me, this is a really nice
cleanup of the APIs and removes a layer of 'magic' and 'indirection'
that really wasn't there and just got in the way of understanding what
is going on here.

llvm-svn: 225784
2015-01-13 11:31:43 +00:00
Chandler Carruth 816702ffe0 [PM] Refactor the new pass manager to use a single template to implement
the generic functionality of the pass managers themselves.

In the new infrastructure, the pass "manager" isn't actually interesting
at all. It just pipelines a single chunk of IR through N passes. We
don't need to know anything about the IR or the passes to do this really
and we can replace the 3 implementations of the exact same functionality
with a single generic PassManager template, complementing the single
generic AnalysisManager template.
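
The idea of one generic manager that pipelines an IR unit through N passes can be sketched as follows. This is an illustration under assumptions, not LLVM's PassManager: `SimplePassManager`, `ToyModule`, and `ToyFunction` are hypothetical, and passes are modelled as plain callables with no analysis invalidation.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// One template serves every IR unit type: it knows nothing about the unit
// or the passes beyond "a pass can be run on a unit".
template <typename IRUnitT> class SimplePassManager {
  std::vector<std::function<void(IRUnitT &)>> passes;

public:
  void addPass(std::function<void(IRUnitT &)> pass) {
    passes.push_back(std::move(pass));
  }

  // Pipeline the single unit through all passes, in insertion order.
  void run(IRUnitT &unit) {
    for (auto &pass : passes)
      pass(unit);
  }
};

// Toy IR units, standing in for real module and function types.
struct ToyModule {
  std::vector<std::string> log;
};
struct ToyFunction {
  int instructionCount = 0;
};

// Typedefs give convenient names to the obvious instantiations.
using ToyModulePassManager = SimplePassManager<ToyModule>;
using ToyFunctionPassManager = SimplePassManager<ToyFunction>;
```

Adding two passes to a `ToyFunctionPassManager` and calling `run` applies them in order to the same `ToyFunction`; the module variant needs no separate implementation.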

I've left typedefs in place to give convenient names to the various
obvious instantiations of the template.

With this, I think I've nuked almost all of the redundant logic in the
managers, and I think the overall design is actually simpler for having
single templates that clearly indicate there is no special logic here.
The logging is made somewhat more annoying by this change, but I don't
think the difference is worth having heavy-weight traits to help log
things.

llvm-svn: 225783
2015-01-13 11:13:56 +00:00
Peter Zotov d1136297d3 Update release notes wrt OCaml bindings.
llvm-svn: 225779
2015-01-13 09:48:02 +00:00
Peter Zotov 1f00ac9368 [OCaml] Use $CAMLORIGIN, an rpath-$ORIGIN-like mechanism in OCaml.
As a result, installations of LLVM in non-standard locations
will not require passing custom -ccopt -L flags when building
the binary, nor will absolute paths be embedded in the cma/cmxa
files. Additionally, the executables will not require changes
to LD_LIBRARY_PATH, although CAML_LD_LIBRARY_PATH still
has to be set for ocamlc without -custom.

See http://caml.inria.fr/mantis/view.php?id=6642.
Note that the patch is approved, but not merged yet.
It will be released in 4.03 and likely 4.02.

llvm-svn: 225778
2015-01-13 09:47:59 +00:00
NAKAMURA Takumi 2f8f0547b1 IR/MetadataTest.cpp: Appease msc17 to avoid initializer list.
llvm-svn: 225775
2015-01-13 08:13:46 +00:00
Mehdi Amini 22e59748ef Peephole opt needs optimizeSelect() to keep track of newly created MIs
The peephole optimizer scans a basic block forward. At some point it
needs to answer the question "given a pointer to an MI in the current
BB, is it located before or after the current instruction?"
To do this, it keeps a set of the MIs already seen during the scan;
if an MI is not in the set, it is assumed to be after.
This means that newly created MIs have to be inserted in the set as well.

This commit passes the set as an argument to the target-dependent 
optimizeSelect() so that it can properly update the set with the 
(potentially) newly created MIs.
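
The bookkeeping described here can be sketched roughly as follows. The names `ToyMI`, `SeenSet`, `isBeforeCurrent`, and `rewriteSelect` are hypothetical stand-ins, not the actual peephole optimizer or target hook signatures:

```cpp
#include <set>

// Toy machine instruction; only its address matters here.
struct ToyMI {
  int opcode;
};

using SeenSet = std::set<const ToyMI *>;

// During a forward scan, an instruction counts as "before the current
// position" exactly when it is already in the seen set.
bool isBeforeCurrent(const SeenSet &seen, const ToyMI *mi) {
  return seen.count(mi) != 0;
}

// Hypothetical stand-in for a target hook that rewrites a select. It takes
// the seen set as an argument so that the instruction it "creates"
// (`replacement`) is recorded and later queries place it correctly.
const ToyMI *rewriteSelect(ToyMI &replacement, SeenSet &seen) {
  seen.insert(&replacement);
  return &replacement;
}
```

Without the insertion inside the hook, a query about the new instruction would wrongly report it as lying after the scan position.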

llvm-svn: 225772
2015-01-13 07:07:13 +00:00
Ramkumar Ramachandra 181233b2b7 fix {typo, build failure} in r225760
llvm-svn: 225762
2015-01-13 04:17:47 +00:00
Ramkumar Ramachandra 40c3e03e27 Standardize {pred,succ,use,user}_empty()
The functions {pred,succ,use,user}_{begin,end} exist, but many users
have to check *_begin() with *_end() by hand to determine if the
BasicBlock or User is empty. Fix this with a standard *_empty(),
demonstrating a few use cases.
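
A sketch of the pattern on a toy type, assuming free-function accessors in the same spirit. `ToyBlock` and the three functions below are defined here for illustration and are not LLVM's declarations:

```cpp
#include <vector>

// Toy block with a predecessor list; not LLVM's BasicBlock.
struct ToyBlock {
  std::vector<const ToyBlock *> preds;
};

using pred_iterator = std::vector<const ToyBlock *>::const_iterator;

pred_iterator pred_begin(const ToyBlock &bb) { return bb.preds.begin(); }
pred_iterator pred_end(const ToyBlock &bb) { return bb.preds.end(); }

// The convenience wrapper: callers no longer compare begin and end by hand.
bool pred_empty(const ToyBlock &bb) { return pred_begin(bb) == pred_end(bb); }
```

Call sites that previously wrote `pred_begin(bb) == pred_end(bb)` can then read `pred_empty(bb)`.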

llvm-svn: 225760
2015-01-13 03:46:47 +00:00
Saleem Abdulrasool faa4f074eb ARM: prepare prefix parsing for improved AAELF support
AAELF specifies a number of ELF specific relocation types which have custom
prefixes for the symbol reference.  Switch the parser to be more table driven
with an idea of file formats for which they apply.  NFC.

llvm-svn: 225758
2015-01-13 03:22:49 +00:00
Chandler Carruth 7ad6d620b7 [PM] Fold all three analysis managers into a single AnalysisManager
template.

This consolidates three copies of nearly the same core logic. It adds
"complexity" to the ModuleAnalysisManager in that it makes it possible
to share a ModuleAnalysisManager across multiple modules... But it does
so by deleting *all of the code*, so I'm OK with that. This will
naturally make fixing bugs in this code much simpler, etc.

The only down side here is that we have to use 'typename' and 'this->'
in various places, and the implementation is lifted into the header.
I'll take that for the code size reduction.

The convenient names are still typedef-ed and used throughout so that
users can largely ignore this aspect of the implementation.

The follow-up change to this will do the exact same refactoring for the
PassManagers. =D

It turns out that the interesting different code is almost entirely in
the adaptors. At the end, that should be essentially all that is left.

llvm-svn: 225757
2015-01-13 02:51:47 +00:00
Richard Trieu 5dc76a5d34 Disable a warning for self move since the test is checking for this behavior.
llvm-svn: 225754
2015-01-13 02:10:33 +00:00
Sanjay Patel db8e6f472e fix typo; NFC
llvm-svn: 225753
2015-01-13 01:51:52 +00:00
Reid Kleckner 3542ace6ef Rename llvm.recoverframeallocation to llvm.framerecover
This name is less descriptive, but it sort of puts things in the
'llvm.frame...' namespace, relating it to frameallocate and
frameaddress. It also avoids using "allocate" and "allocation" together.

llvm-svn: 225752
2015-01-13 01:51:34 +00:00
Chandler Carruth 759d960ce4 [PM] Fix another place where I was using an overly generic T&& for the
IR unit to directly use IRUnitT& for now.

llvm-svn: 225750
2015-01-13 01:44:56 +00:00
Duncan P. N. Exon Smith 1e05ea6173 IR: Use unique_ptr, NFC
Use `std::unique_ptr<>`, as suggested by David Blaikie.

llvm-svn: 225749
2015-01-13 00:57:27 +00:00
Paul Robinson 6f4a19f1bd Phabricator calls it "subscriber" not "cc"
llvm-svn: 225747
2015-01-13 00:50:31 +00:00
Reid Kleckner e9b8931873 Add the llvm.frameallocate and llvm.recoverframeallocation intrinsics
These intrinsics allow multiple functions to share a single stack
allocation from one function's call frame. The function with the
allocation may only perform one allocation, and it must be in the entry
block.

Functions accessing the allocation call llvm.recoverframeallocation with
the function whose frame they are accessing and a frame pointer from an
active call frame of that function.

These intrinsics are very difficult to inline correctly, so the
intention is that they be introduced rarely, or at least very late
during EH preparation.

Reviewers: echristo, andrew.w.kaylor

Differential Revision: http://reviews.llvm.org/D6493

llvm-svn: 225746
2015-01-13 00:48:10 +00:00
Duncan P. N. Exon Smith 845755c4bb IR: Remove an invalid assertion when replacing resolved operands
This adds back the testcase from r225738, and adds to it.  Looks like we
need both sides for now (the assertion was incorrect both ways, and
although it seemed reasonable (when written correctly) it wasn't
particularly important).

llvm-svn: 225745
2015-01-13 00:46:34 +00:00
Matt Arsenault a982e4f82b Combine fcmp + select to fminnum / fmaxnum if no nans and legal
Also require unsafe FP math for now since there isn't a way to
test for signed zeros.
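
Why the no-NaNs and signed-zero conditions matter can be seen by comparing the compare-and-select idiom with a library minimum. This is a sketch that assumes `std::fmin` is an adequate stand-in for fminnum; `selectMin` and `libraryMin` are illustrative names:

```cpp
#include <cmath>

// The compare-and-select idiom: picks `a` only when `a < b` is true.
double selectMin(double a, double b) { return a < b ? a : b; }

// Library minimum, which returns the non-NaN operand when exactly one
// input is NaN.
double libraryMin(double a, double b) { return std::fmin(a, b); }
```

With a NaN second operand, `selectMin(1.0, NaN)` yields NaN while `libraryMin(1.0, NaN)` yields 1.0. For zeros of opposite sign, `selectMin` simply returns its second operand, so its result depends on operand order, which a minimum operation need not reproduce.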

llvm-svn: 225744
2015-01-13 00:43:00 +00:00
Chandler Carruth 2e7522e9ce [PM] Re-clang-format much of this code as the code has changed some and
so has clang-format. Notably, this fixes a bunch of formatting in the
CGSCC pass manager side of things that has been improved in clang-format
recently.

llvm-svn: 225743
2015-01-13 00:36:47 +00:00
Duncan P. N. Exon Smith 2cc792b1d1 Revert "IR: Fix an inverted assertion when replacing resolved operands"
This reverts commit r225738.  Maybe the assertion is just plain wrong,
but this version fails on WAY more bots.  I'll make sure both ways work
in a follow-up but I want to get bots green in the meantime.

llvm-svn: 225742
2015-01-13 00:34:21 +00:00
Eric Christopher acf25766ad Grammar and spelling.
llvm-svn: 225740
2015-01-13 00:21:14 +00:00
Duncan P. N. Exon Smith e4c842f816 IR: Fix an inverted assertion when replacing resolved operands
Add a unit test, since this bug was only exposed by clang tests.  Thanks
to Rafael for tracking this down!

llvm-svn: 225738
2015-01-13 00:10:38 +00:00
Hans Wennborg 427e1214b4 Release merge script: don't actually commit the merge
Instead, just present the command for committing it. This way,
the user can test the merge locally, resolve conflicts, etc.
before committing, which seems much safer to me.

llvm-svn: 225737
2015-01-13 00:07:31 +00:00
Hans Wennborg 68aaa4daf5 Release tag script: add -revision option
It seems useful to be able to create the branch at a revision that looks good
on the buildbots.

llvm-svn: 225736
2015-01-13 00:07:29 +00:00
Hans Wennborg 332c4b7ee9 Release tag script: add -dry-run flag
llvm-svn: 225735
2015-01-13 00:07:22 +00:00
Adrian Prantl 66f2595845 Debug Info: Move support for constants into DwarfExpression.
Move the declaration of DebugLocDwarfExpression into DwarfExpression.h
because it needs to be accessed from AsmPrinterDwarf.cpp and DwarfDebug.cpp

NFC.

llvm-svn: 225734
2015-01-13 00:04:06 +00:00
Duncan P. N. Exon Smith a6de6a4013 IR: Split out writeMDTuple(), NFC
Prepare for more subclasses of `UniquableMDNode` than `MDTuple`.

llvm-svn: 225732
2015-01-12 23:45:31 +00:00
Adrian Prantl a4c30d6509 Make DwarfExpression store the AsmPrinter instead of the TargetMachine.
NFC.

llvm-svn: 225731
2015-01-12 23:36:56 +00:00
Adrian Prantl 9cffbd8daa remove extra semicolon
llvm-svn: 225730
2015-01-12 23:36:50 +00:00
Reid Kleckner bba20f06de musttail: Only set the inreg flag for fastcall and vectorcall
Otherwise we'll attempt to forward ECX, EDX, and EAX for cdecl and
stdcall thunks, leaving us with no scratch registers for indirect call
targets.

Fixes PR22052.

llvm-svn: 225729
2015-01-12 23:28:23 +00:00
Matt Arsenault 64dae8354b R600/SI: Remove redundant setting expand on f64 vectors
None of these are legal types already, so they default to
Expand.

llvm-svn: 225728
2015-01-12 23:13:00 +00:00
Duncan P. N. Exon Smith b3ff1cbc08 IR: Unbreak the MSVC build after r225689
llvm-svn: 225727
2015-01-12 23:09:14 +00:00
Adrian Prantl 337e360279 Run clang-format on the parts of AsmPrinterDwarf where it improves the
readability.

llvm-svn: 225726
2015-01-12 23:03:23 +00:00
Adrian Prantl 0fec811d7b Debug Info: Add a virtual destructor to DwarfExpression.
Thanks Chandler for noticing!

llvm-svn: 225724
2015-01-12 22:59:28 +00:00
Chandler Carruth 2482fe0b52 [PM] Sink the reference vs. value decision for IR units out of the
templated interface.

So far, every single IR unit I can come up with has address-identity.
That is, when two units of IR are both active in LLVM, their addresses
will be distinct if the IR is distinct. This is clearly true for
Modules, Functions, BasicBlocks, and Instructions. It turns out that the
only practical way to make the CGSCC stuff work the way we want is to
make it true for SCCs as well. I expect this pattern to continue.

When first designing the pass manager code, I kept this dimension of
freedom in the type parameters, essentially allowing for a wrapper-type
whose address did not form identity. But that really no longer makes
sense and is making the code more complex or subtle for no gain. If we
ever have an actual use case for this, we can figure out what makes
sense then and there. It will be better because then we will have the
actual example in hand.

While the simplifications afforded in this patch are fairly small
(mostly sinking the '&' out of many type parameters onto a few
interfaces), it would have become much more pronounced with subsequent
changes. I have a sequence of changes that will completely remove the
code duplication that currently exists between all of the pass managers
and analysis managers. =] Should make things much cleaner and avoid bug
fixing N times for the N pass managers.

llvm-svn: 225723
2015-01-12 22:53:31 +00:00
Duncan P. N. Exon Smith 3f0ad4a80e IR: Remove incorrect comment, NFC
llvm-svn: 225722
2015-01-12 22:53:18 +00:00
Duncan P. N. Exon Smith 380902ef6b IR: Fix unit test memory leak reported by ASan
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/603/steps/check-llvm%20asan/logs/stdio

Thanks Alexey for pointing me to this!

llvm-svn: 225721
2015-01-12 22:46:15 +00:00
Adrian Prantl 0d5df0ac1c Untwine this expression. Thanks to David for noticing!
llvm-svn: 225720
2015-01-12 22:39:14 +00:00
Simon Pilgrim d88ab87064 [X86][SSE] Minor regression fix for r225551
r225551 vector byte shuffle optimization caused an assertion as fully zeroable vectors can be produced under certain circumstances. This fix drops the assert and returns a zero vector where the assert would have failed.

llvm-svn: 225718
2015-01-12 22:38:08 +00:00
Adrian Prantl 0e6ffb9d0d Debug Info: Implement DwarfUnit::addRegisterOpPiece() using DwarfExpression.
NFC.

llvm-svn: 225717
2015-01-12 22:37:16 +00:00
Duncan P. N. Exon Smith 49503f827d Bitcode: Range-based for, NFC
llvm-svn: 225716
2015-01-12 22:35:34 +00:00
Duncan P. N. Exon Smith b1ad5d39a9 Bitcode: Add abbreviation for METADATA_NAME
llvm-svn: 225715
2015-01-12 22:34:10 +00:00
Duncan P. N. Exon Smith f8dd6ad6de Bitcode: Range-based for, NFC
llvm-svn: 225714
2015-01-12 22:33:00 +00:00
Duncan P. N. Exon Smith 73d5aae74c Bitcode: Range-based for, NFC
llvm-svn: 225713
2015-01-12 22:31:35 +00:00
Duncan P. N. Exon Smith 2fcf60e78e Bitcode: Simplify emission of METADATA_BLOCK
Refactor logic so that we know up-front whether to open a block and
whether we need an MDString abbreviation.

This is almost NFC, but will start emitting `MDString` abbreviations
when the first record is not an `MDString`.

llvm-svn: 225712
2015-01-12 22:30:34 +00:00
Duncan P. N. Exon Smith 0b31dd1d67 AsmParser: Use subclass API instead of MDNode wrappers, NFC
Use subclass API instead of the wrappers in `MDNode` in the assembly
parser.  This will make the code easier to follow once we have multiple
subclasses.

llvm-svn: 225711
2015-01-12 22:27:39 +00:00
Duncan P. N. Exon Smith f825dae836 AsmParser: Factor duplicated code into ParseMDNode(), NFC
llvm-svn: 225710
2015-01-12 22:26:48 +00:00
Duncan P. N. Exon Smith 62a7919f6b AsmParser: Reorder ParseMetadata() logic, NFC
llvm-svn: 225709
2015-01-12 22:24:50 +00:00
Duncan P. N. Exon Smith dbcff30bd1 AsmParser: Simplify ParseMDTuple(), NFC
llvm-svn: 225708
2015-01-12 22:23:04 +00:00
Adrian Prantl 00dbc2a7d3 Debug Info: Implement DwarfUnit::addRegisterOffset using DwarfExpression.
No functional change.

llvm-svn: 225707
2015-01-12 22:19:26 +00:00
Adrian Prantl b16d9ebb0c Debug info: Factor out the creation of DWARF expressions from AsmPrinter
into a new class DwarfExpression that can be shared between AsmPrinter
and DwarfUnit.

This is the first step towards unifying the two entirely redundant
implementations of dwarf expression emission in DwarfUnit and AsmPrinter.

Almost no functional change — Testcases were updated because asm comments
that used to be on two lines now appear on the same line, which is
actually preferable.

llvm-svn: 225706
2015-01-12 22:19:22 +00:00
Duncan P. N. Exon Smith 58ef9d142a AsmParser: ParseMDNode() => ParseMDTuple(), NFC
This isn't parsing arbitrary subclasses of `MDNode`, just `MDTuple`.

llvm-svn: 225702
2015-01-12 21:23:11 +00:00
Sanjay Patel 06d5589a84 80-cols; NFC
llvm-svn: 225700
2015-01-12 21:21:28 +00:00
Duncan P. N. Exon Smith a8d9a026d9 AsmParser: Remove unused version of ParseMDNodeID()
Merge the two versions of `ParseMDNodeID()` now that no one needs
special forward references.

llvm-svn: 225699
2015-01-12 21:14:38 +00:00
Duncan P. N. Exon Smith ab617d5977 AsmParser: Use normal references for metadata attachments, NFC
Remove special parsing logic for metadata attachments.  Now that
`DebugLoc` is stored normally (since the metadata/value split), we don't
need this special forward referencing logic.

llvm-svn: 225698
2015-01-12 21:13:09 +00:00
Duncan P. N. Exon Smith bf68e80d06 IR: Prepare for a new UniquableMDNode subclass, NFC
Add generic dispatch for the parts of `UniquableMDNode` that cast to
`MDTuple`.  This makes adding other subclasses (like PR21433's
`MDLocation`) easier.

llvm-svn: 225697
2015-01-12 20:56:33 +00:00
Duncan P. N. Exon Smith 6b1f4659f9 IR: Stop erasing MDNodes from uniquing sets during teardown
Stop erasing `MDNode`s from the uniquing sets in `LLVMContextImpl`
during teardown (in particular, during
`UniquableMDNode::~UniquableMDNode()`).  Although it's currently
feasible, there isn't any clear benefit and it may not be feasible for
other subclasses (which don't explicitly store the lookup hash).

llvm-svn: 225696
2015-01-12 20:50:25 +00:00
Bill Schmidt a2dece27e4 First crack at PowerPC 3.6 release notes
llvm-svn: 225695
2015-01-12 20:46:43 +00:00
Eric Fiselier ffbadedcf9 [LIT] Remove string decoding in gtest discovery code. lit.util.capture now does decoding.
llvm-svn: 225693
2015-01-12 20:43:34 +00:00
Ahmed Bougacha 291833b959 [X86] Also create+widen FMIN/FMAX nodes for v2f32.
This happens in the HINT benchmark, where the SLP-vectorizer created
v2f32 fcmp/select code.  The "correct" solution would have been to
teach the vectorizer cost model that v2f32 isn't legal (because really,
it isn't), but if we can vectorize we might as well do so.

We legalize these v2f32 FMIN/FMAX nodes by widening to v4f32 later on.
v3f32 were already widened to v4f32 by the generic unroll-and-build-vector
legalization.

rdar://15763436
Differential Revision: http://reviews.llvm.org/D6557

llvm-svn: 225691
2015-01-12 20:31:30 +00:00
Duncan P. N. Exon Smith 942623540b IR: Move creation logic to MDNodeFwdDecl, NFC
Same as with `MDTuple`, factor out a `friend MDNode` by moving creation
logic to the concrete subclass.

llvm-svn: 225690
2015-01-12 20:21:37 +00:00
Duncan P. N. Exon Smith b565b10956 IR: Make MDNodeFwdDecl destructor public
Now that the leak detector is gone, anyone can call this.

llvm-svn: 225689
2015-01-12 20:19:54 +00:00
Ahmed Bougacha 66fde538ee [X86] Make SSE min/max testcases more explicit. NFC.
llvm-svn: 225687
2015-01-12 20:15:47 +00:00
Duncan P. N. Exon Smith ac3128d901 IR: Move creation logic down to MDTuple, NFC
Move creation logic for `MDTuple`s down where it belongs.  Once there
are a few more subclasses, these functions really won't make much sense
here (the `friend` relationship was already awkward).  For now, leave
the `MDNode` versions around, but have it forward down.

llvm-svn: 225685
2015-01-12 20:13:56 +00:00
Duncan P. N. Exon Smith 3c94844a48 IR: Push storeDistinctInContext() down to UniquableMDNode, NFC
llvm-svn: 225683
2015-01-12 20:11:32 +00:00
Duncan P. N. Exon Smith 118632dbf6 IR: Split GenericMDNode into MDTuple and UniquableMDNode
Split `GenericMDNode` into two classes (with more descriptive names).

  - `UniquableMDNode` will be a common subclass for `MDNode`s that are
    sometimes uniqued like constants, and sometimes 'distinct'.

    This class gets the (short-lived) RAUW support and related API.

  - `MDTuple` is the basic tuple that has always been returned by
    `MDNode::get()`.  This is as opposed to more specific nodes to be
    added soon, which have additional fields, custom assembly syntax,
    and extra semantics.

    This class gets the hash-related logic, since other subclasses of
    `UniquableMDNode` may need to hash based on other fields.

To keep this diff from getting too big, I've added casts to `MDTuple`
that won't really scale as new subclasses of `UniquableMDNode` are
added, but I'll clean those up incrementally.

(No functionality change intended.)

llvm-svn: 225682
2015-01-12 20:09:34 +00:00
Eric Fiselier 30045e6148 [LIT] Decode string result in lit.util.capture
Summary: I think this is probably a bug, but I'm putting this up for review just to be sure. I think that `lit.util.capture` should decode the resulting string in the same way `lit.util.executeCommand` does.

Reviewers: ddunbar, EricWF

Reviewed By: EricWF

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6769

llvm-svn: 225681
2015-01-12 20:09:34 +00:00
Duncan P. N. Exon Smith 0c87d77175 IR: Invert logic to simplify control flow, NFC
llvm-svn: 225670
2015-01-12 19:45:44 +00:00
Duncan P. N. Exon Smith 34c3d10363 IR: Separate out decrementUnresolvedOperandCount(), NFC
llvm-svn: 225667
2015-01-12 19:43:15 +00:00
Duncan P. N. Exon Smith d9e6eb7108 IR: Prevent handleChangedOperand() recursion
Instead of returning early on `handleChangedOperand()` recursion
(finally identified (and test added) in r225657), prevent it upfront by
releasing operands before RAUW.

Aside from massively different program flow, there should be no
functionality change ;).

llvm-svn: 225665
2015-01-12 19:36:35 +00:00
Tom Stellard b6550529a6 R600/SI: Use RegisterOperands to specify which operands can accept immediates
There are some operands which can take either immediates or registers
and we were previously using different register class to distinguish
between operands that could take immediates and those that could not.

This patch switches to using RegisterOperands which should simplify the
backend by reducing the number of register classes and also make it
easier to implement the assembler.

llvm-svn: 225662
2015-01-12 19:33:18 +00:00
Tom Stellard 89b26108b1 Target: Allow target specific operand types
This adds two new fields to the RegisterOperand TableGen class:

string OperandNamespace = "MCOI";
string OperandType = "OPERAND_REGISTER";

These fields can be used to specify a target specific operand type,
which will be stored in the OperandType member of the MCOperandInfo
object.

This can be useful for targets that need to store some extra information
about operands that cannot be expressed using the target independent
types.  For example, in the R600 backend, there are operands which
can take either registers or immediates and it is convenient to be able
to specify this in the TableGen definitions.

llvm-svn: 225661
2015-01-12 19:33:09 +00:00
Sanjay Patel 5f1d9eaad3 GVN: propagate equalities for floating point compares
Allow optimizations based on FP comparison values in the same way
as integers. 

This resolves PR17713:
http://llvm.org/bugs/show_bug.cgi?id=17713

Differential Revision: http://reviews.llvm.org/D6911

llvm-svn: 225660
2015-01-12 19:29:48 +00:00
Duncan P. N. Exon Smith 5f4618923c IR: Add test for handleChangedOperand() recursion
Turns out this can happen.  Remove the `FIXME` and add a testcase that
crashes without the extra logic.

llvm-svn: 225657
2015-01-12 19:22:04 +00:00
Duncan P. N. Exon Smith 967629e14a IR: Separate out recalculateHash(), NFC
llvm-svn: 225655
2015-01-12 19:16:34 +00:00
Duncan P. N. Exon Smith 3a16d80a44 IR: Separate out helper: resolveAfterOperandChange(), NFC
llvm-svn: 225654
2015-01-12 19:14:15 +00:00
Duncan P. N. Exon Smith 5c5710b890 IR: Use SubclassData32 directly, NFC
Simplify some logic by accessing `SubclassData32` directly instead of
relying on API.

llvm-svn: 225653
2015-01-12 19:12:37 +00:00
Matthias Braun f5d931f716 RegisterCoalescer: Turn some impossible conditions into asserts
This is a fixed version of reverted r225500. It fixes the too early
if() continue; of the last patch and adds a comment to the unorthodox
loop.

llvm-svn: 225652
2015-01-12 19:10:17 +00:00
Duncan P. N. Exon Smith 6c0aee3248 IR: Don't allow operands to become unresolved
Operands shouldn't change from being resolved to unresolved during graph
construction.  Simplify the logic based on that assumption.

llvm-svn: 225649
2015-01-12 18:59:40 +00:00
Duncan P. N. Exon Smith c0286874d7 IR: Remove redundant comment, NFC
llvm-svn: 225648
2015-01-12 18:45:32 +00:00
Duncan P. N. Exon Smith 686162b1bc IR: Simplify code, NFC
llvm-svn: 225647
2015-01-12 18:45:01 +00:00
Duncan P. N. Exon Smith d1474eea71 IR: Make temporary nodes distinct
Change the return of `MDNode::isDistinct()` for `MDNode::getTemporary()`
to `true`.  They aren't uniqued.

llvm-svn: 225646
2015-01-12 18:41:26 +00:00
Rafael Espindola d9c3e308f5 Add r224985 back with two fixes.
One is that AArch64 has additional restrictions on when local relocations can
be used. We have to take those into consideration when deciding to put a L
symbol in the symbol table or not.

The other is that ld64 requires the relocations to cstring to use linker
visible symbols on AArch64.

Thanks to Michael Zolotukhin for testing this!

Remove doesSectionRequireSymbols.

In an assembly expression like

bar:
.long L0 + 1

the intended semantics is that bar will contain a pointer one byte past L0.

In sections that are merged by content (strings, 4 byte constants, etc), a
single position in the section doesn't give the linker enough information.
For example, it would not be able to tell a relocation must point to the
end of a string, since that would look just like the start of the next.

The solution used in ELF is to use relocations with symbols if there is a non-zero
addend.

In MachO before this patch we would just keep all symbols in some sections.

This would miss some cases (only cstrings on x86_64 were implemented) and was
inefficient since most relocations have an addend of 0 and can be represented
without the symbol.

This patch implements the non-zero addend logic for MachO too.
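
The non-zero addend rule described above can be sketched roughly as follows. This is an illustrative model only, not the code from the patch: the function name `needs_symbol_relocation` and its parameters are hypothetical, and the target-specific restrictions mentioned earlier (AArch64, ld64) are deliberately ignored.

```python
def needs_symbol_relocation(section_is_mergeable: bool, addend: int) -> bool:
    """Return True if the relocation should reference the symbol itself
    rather than a plain position in the section.

    Hypothetical helper: models only the non-zero addend rule, with no
    target-specific restrictions.
    """
    # In a section merged by content, the end of one string and the start of
    # the next are the same section offset, so a non-zero addend is only
    # unambiguous when the relocation names the symbol.
    return section_is_mergeable and addend != 0


# `.long L0 + 1` in a cstring section: the addend is 1, so keep the symbol.
print(needs_symbol_relocation(True, 1))   # True
# `.long L0` in the same section: the addend is 0, so the symbol can be dropped.
print(needs_symbol_relocation(True, 0))   # False
```

Most relocations fall into the second case, which is why keeping every symbol in such sections was described as inefficient.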

llvm-svn: 225644
2015-01-12 18:13:07 +00:00
Duncan P. N. Exon Smith daa335a9c2 IR: Simplify replaceOperandWith(), NFC
This will call `handleChangedOperand()` less frequently, but in that
case (i.e., `isStoredDistinctInContext()`) it has identical logic to
here.

llvm-svn: 225643
2015-01-12 18:01:45 +00:00
Duncan P. N. Exon Smith 54df9896e3 IR: Remove redundant calls to MDNode::setHash(), NFC
`storeDistinctInContext()` already calls `setHash(0)`.

llvm-svn: 225642
2015-01-12 17:57:38 +00:00
Timur Iskhodzhanov 00ede84084 [ASan] Move the shadow on Windows 32-bit from 0x20000000 to 0x40000000
llvm-svn: 225641
2015-01-12 17:38:58 +00:00
Ahmed Bougacha e03bef7543 [SimplifyLibCalls] Factor out fortified libcall handling.
This lets us remove CGP duplicate.

Differential Revision: http://reviews.llvm.org/D6541

llvm-svn: 225640
2015-01-12 17:22:43 +00:00
Ahmed Bougacha 6722f5e5b3 [SimplifyLibCalls] Factor out str/mem libcall optimizations.
Put them in a separate function, so we can reuse them to further
simplify fortified libcalls as well.

Differential Revision: http://reviews.llvm.org/D6540

llvm-svn: 225639
2015-01-12 17:20:06 +00:00
Ahmed Bougacha b7d8afb6c5 [SimplifyLibCalls] Factor out signature checks for fortifiable libcalls.
The checks are the same for fortified counterparts to the libcalls, so
we might as well do them in a single place.

Differential Revision: http://reviews.llvm.org/D6539

llvm-svn: 225638
2015-01-12 17:18:19 +00:00
Jozef Kolek 9761e96b01 [mips][microMIPS] Implement BEQZ16 and BNEZ16 instructions
Differential Revision: http://reviews.llvm.org/D5271

llvm-svn: 225627
2015-01-12 12:03:34 +00:00
Richard Smith 600ee4ad66 Put this test's input in the Inputs directory where it belongs, rather than
reusing a file from a different test directory.

llvm-svn: 225621
2015-01-12 08:50:47 +00:00
Chandler Carruth 06a5dd69e2 Add a new utility script that helps update very simple regression tests.
This script is currently specific to x86 and limited to use with very
small regression or feature tests using 'llc' and 'FileCheck' in
a reasonably canonical way. It is in no way general purpose or robust at
this point. However, it works quite well for simple examples. Here is
the intended workflow:

- Make a change that requires updating N test files and M functions'
  assertions within those files.
- Stash the change.
- Update those N test files' RUN-lines to look "canonical"[1].
- Refresh the FileCheck lines for either the entire file or select
  functions by running this script.
  - The script will parse the RUN lines and run the 'llc' binary you
    give it according to each line, collecting the asm.
  - It will then annotate each function with the appropriate FileCheck
    comments to check every instruction from the start of the first
    basic block to the last return.
  - There will be numerous cases where the script either fails to remove
    the old lines, or inserts checks which need to be manually edited,
    but the manual edits tend to be deletions or replacements of
    registers with FileCheck variables which are fast manual edits.
  - A common pattern is to have the script insert complete checking of
    every instruction, and then edit it down to only check the relevant
    ones.
  - Be careful to do all of these cleanups though! The script is
    designed to make transferring and formatting the asm output of llc
    into a test case fast, it is *not* designed to be authoritative
    about what constitutes a good test!
- Commit the nice fresh baseline of checks.
- Unstash your change and rebuild llc.
- Re-run script to regenerate the FileCheck annotations
  - Remember to re-cleanup these annotations!!!
- Check the diff to make sure this is sane, checking the things you
  expected it to, and check that the newly updated tests actually pass.
- Profit!
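
As a rough illustration of the annotation step in the workflow above, the sketch below turns the asm of one function into FileCheck comment lines. It is not the script from this commit: the function name `check_lines` and the exact line layout (a `CHECK-LABEL` for the function, `CHECK` for the first instruction, `CHECK-NEXT` for the rest) are assumptions made for the example.

```python
def check_lines(func_name, asm_lines):
    """Build FileCheck comment lines for one function's asm.

    Hypothetical layout: one CHECK-LABEL line naming the function, then one
    check per instruction so every instruction is covered in order.
    """
    out = ["; CHECK-LABEL: " + func_name + ":"]
    for index, line in enumerate(asm_lines):
        # The first instruction uses CHECK, the rest use CHECK-NEXT so that
        # no unexpected instruction may appear in between.
        prefix = "; CHECK:" if index == 0 else "; CHECK-NEXT:"
        out.append(prefix + " " + line.strip())
    return out


for line in check_lines("f", ["  movl %edi, %eax", "  retq"]):
    print(line)
```

Checks generated this way cover every instruction, which is exactly why the message stresses editing them down to the relevant ones afterwards.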

Also, I'm *terrible* at writing Python, and frankly I didn't spend a lot
of time making this script beautiful or well engineered. But it's useful
to me and may be useful to others so I thought I'd send it out.

http://reviews.llvm.org/D5546

llvm-svn: 225618
2015-01-12 04:43:18 +00:00
Hal Finkel 87deb0b8e3 [PowerPC] Fix calls to non-function objects
Looking at r225438 inspired me to see how the PowerPC backend handled the
situation (calling a bitcasted TLS global), and it turns out we also produced
an error (cannot select ...). What it means to "call" something that is not a
function is implementation and platform specific, but in the name of doing
something (besides crashing), this makes sure we do what GCC does (treat all
such calls as calls through a function pointer -- meaning that the pointer is
assumed, as is the convention on PPC, to point to a function descriptor
structure holding the actual code address along with the function's TOC pointer
and environment pointer). As GCC does, we now do the same for calling regular
(non-TLS) non-function globals too.

I'm not sure whether this is the most useful way to define the behavior, but at
least we won't be alone.

llvm-svn: 225617
2015-01-12 04:34:47 +00:00
Simon Pilgrim b5869f6c7c [X86][SSE] Minor fix to VPBLENDW AVX2 commutation.
D6015 / rL221313 enabled commutation for SSE immediate blend instructions, but due to a typo the AVX2 VPBLENDW ymm instructions weren't flagged as commutative along with the others in the tables, but were still being commuted in code and tested for.

llvm-svn: 225612
2015-01-11 22:08:01 +00:00
Daniel Sanders 122e7cd1cf Fix silly mistake in release notes for Mips.
llvm-svn: 225608
2015-01-11 10:48:20 +00:00
Daniel Sanders 1bcd70e794 Added release notes for the Mips target.
llvm-svn: 225607
2015-01-11 10:34:52 +00:00
David Majnemer 14141f941a Revert most of r225597
We can't rely on a DataLayout enlightened constant folder.

llvm-svn: 225599
2015-01-11 07:29:51 +00:00
David Majnemer 292d0c796b X86: Properly decode shuffle masks when the constant pool type is weird
It's possible for the constant pool entry for the shuffle mask to come
from a completely different operation.  This occurs when Constants have
the same bit pattern but have different types.

Make DecodePSHUFBMask tolerant of types which, after a bitcast, are
appropriately sized vector types.

This fixes PR22188.

llvm-svn: 225597
2015-01-11 05:08:57 +00:00
Saleem Abdulrasool 9cf2679d3b X86: teach X86TargetLowering about L,M,O constraints
Teach the ISelLowering for X86 about the L,M,O target specific constraints.
Although, for the moment, clang performs constraint validation and prevents
passing along inline asm which may have immediate constant constraints violated,
the backend should be able to cope with the invalid inline asm a bit better.

llvm-svn: 225596
2015-01-11 04:39:24 +00:00
Saleem Abdulrasool fe781977b9 ARM: add support for segment base relocations (SBREL)
This adds support for parsing and emitting the SBREL relocation variant for the
ARM target.  Handling this relocation variant is necessary for supporting the
full ARM ELF specification.  Addresses PR22128.

llvm-svn: 225595
2015-01-11 04:39:18 +00:00
Chandler Carruth c491f72e7a [x86] Remove some windows line endings that snuck into the tests here.
Folks on Windows, remember to set up your subversion to strip these when
submitting...

llvm-svn: 225593
2015-01-11 01:36:20 +00:00
Chandler Carruth ffc5c1f3b8 [ADT] Remove the unused default constructor for iterator_range.
This default constructor is a bit weird. It left the range in an invalid
state. That might be reasonable so that you can construct a local
iterator range and assign to it based on some logic to compute the range
you want. If folks would like to support that use case, I can add it
back, but in 238-odd usages none have actually wanted to do this. ;]

llvm-svn: 225592
2015-01-11 01:16:26 +00:00
Sanjoy Das 81401d4b19 Fix PR22179.
We were incorrectly inferring nsw for certain SCEVs. We can be more
aggressive here (see Richard Smith's comment on
http://llvm.org/bugs/show_bug.cgi?id=22179) but this change just
focuses on correctness.

Differential Revision: http://reviews.llvm.org/D6914

llvm-svn: 225591
2015-01-10 23:41:24 +00:00
Joerg Sonnenberger 8a36a8e5d4 Revert r225500, it leads to infinite loops.
llvm-svn: 225590
2015-01-10 21:49:36 +00:00
Simon Pilgrim 94a4cc027a [X86][SSE] Improved (v)insertps shuffle matching
In the current code we only attempt to match against insertps if we have exactly one element from the second input vector, irrespective of how much of the shuffle result is zeroable.

This patch checks to see if there is a single non-zeroable element from either input that requires insertion. It also supports matching of cases where only one of the inputs needs to be referenced.

We also split insertps shuffle matching off into a new lowerVectorShuffleAsInsertPS function.

Differential Revision: http://reviews.llvm.org/D6879

llvm-svn: 225589
2015-01-10 19:45:33 +00:00
Ramkumar Ramachandra 9be98b6bef .gitignore: add some rules for tagging programs
Often, we miss committing new files, and 'arc diff' is supposed to warn
us about this. Unfortunately, because of the spurious output of the
command (due to unignored untracked files), we tend to ignore it and
lose information.

llvm-svn: 225588
2015-01-10 19:11:29 +00:00
Hal Finkel 5d5d1539cc [PowerPC] Mark zext of a small scalar load as free
This initial implementation of PPCTargetLowering::isZExtFree marks as free
zexts of small scalar loads (that are not sign-extending). This callback is
used by SelectionDAGBuilder's RegsForValue::getCopyToRegs, and thus to
determine whether a zext or an anyext is used to lower illegally-typed PHIs.
Because later truncates of zero-extended values are nops, this allows for the
elimination of later unnecessary truncations.

Fixes the initial complaint associated with PR22120.

llvm-svn: 225584
2015-01-10 08:21:59 +00:00
Justin Hibbits 17744c1e0d Remove some whitespace.
llvm-svn: 225583
2015-01-10 07:50:31 +00:00
Dmitri Gribenko cbc7ae25da ConvertUTFTest: fix misleading empty line
llvm-svn: 225580
2015-01-10 05:03:29 +00:00
Saleem Abdulrasool c552218e28 tests: fix previous commit
The previous commit accidentally missed changes to the test output checking,
resulting in an errant failure.

llvm-svn: 225577
2015-01-10 02:53:25 +00:00
Saleem Abdulrasool 48bbb6c821 test: merge ARM relocations test
There is a fair number of relocations that are part of the AAELF specification.
Simply merge the tests into a single test file, otherwise, we will end up with
far too many test files to test each relocation type.  NFC.

llvm-svn: 225576
2015-01-10 02:48:29 +00:00
Saleem Abdulrasool ff2da70fdd tests: convert a couple of ARM relocation tests to readobj
These tests are checking the relocation generation.  Use the readobj output as
it is much easier to follow when glancing over the tests.

llvm-svn: 225575
2015-01-10 02:48:25 +00:00
Justin Hibbits 654346e6f9 Fully fix Bug #22115.
Summary:
In the previous commit, the register was saved, but space was not allocated.
This resulted in the parameter save area potentially clobbering r30, leading to
nasty results.

Test Plan: Tests updated

Reviewers: hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6906

llvm-svn: 225573
2015-01-10 01:57:21 +00:00
Alexey Samsonov 7c8a725116 Fix undefined behavior (shift of negative value) in RuntimeDyldMachOAArch64::encodeAddend.
Test Plan: regression test suite with/without UBSan.

Reviewers: lhames, ributzka

Subscribers: aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D6908

llvm-svn: 225568
2015-01-10 00:46:38 +00:00
Hal Finkel 611b127ad8 [PowerPC] Readjust the loop unrolling threshold
Now that the way that the partial unrolling threshold for small loops is used
to compute the unrolling factor has been corrected, a slightly smaller threshold
is preferable. This is expected; other targets may need to re-tune as well.

llvm-svn: 225566
2015-01-10 00:31:10 +00:00
Hal Finkel 38dd590861 [LoopUnroll] Fix the partial unrolling threshold for small loop sizes
When we compute the size of a loop, we include the branch on the backedge and
the comparison feeding the conditional branch. Under normal circumstances,
these don't get replicated with the rest of the loop body when we unroll. This
led to the somewhat surprising behavior that really small loops would not get
unrolled enough -- they could be unrolled more and the resulting loop would be
below the threshold, because we were assuming they'd take
(LoopSize * UnrollingFactor) instructions after unrolling, instead of
(((LoopSize-2) * UnrollingFactor)+2) instructions. This fixes that computation.
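
The effect of the two size estimates can be seen in a small sketch. This is a simplified model, not the unroller's code: the helper names, the threshold of 44, and the cap of 64 are assumptions chosen for the example, and the real pass applies further constraints when picking a factor.

```python
def unrolled_size_old(loop_size, factor):
    # Old estimate: every instruction is replicated in each copy.
    return loop_size * factor


def unrolled_size_new(loop_size, factor):
    # New estimate: the backedge branch and the compare feeding it
    # (2 instructions) are counted once, not once per copy.
    return (loop_size - 2) * factor + 2


def max_factor(loop_size, threshold, size_fn, limit=64):
    """Largest factor (up to `limit`) whose estimated size fits the threshold."""
    factor = 1
    while factor < limit and size_fn(loop_size, factor + 1) <= threshold:
        factor += 1
    return factor


# A 4-instruction loop with a threshold of 44 instructions:
print(max_factor(4, 44, unrolled_size_old))  # 11
print(max_factor(4, 44, unrolled_size_new))  # 21
```

The smaller the loop, the larger the share taken by the two non-replicated instructions, so the gap between the two estimates is widest for really small loops.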

llvm-svn: 225565
2015-01-10 00:30:55 +00:00
Rafael Espindola d0b23bef6f Use the DiagnosticHandler to print diagnostics when reading bitcode.
The bitcode reading interface used std::error_code to report an error to the
callers and it is the caller's job to print diagnostics.

This is not ideal for error handling or diagnostic reporting:

* For error handling, all that the callers care about is 3 possibilities:
  * It worked
  * The bitcode file is corrupted/invalid.
  * The file is not bitcode at all.

* For diagnostic, it is user friendly to include far more information
  about the invalid case so the user can find out what is wrong with the
  bitcode file. This comes up, for example, when a developer introduces a
  bug while extending the format.

The compromise we had was to have a lot of error codes.

With this patch we use the DiagnosticHandler to communicate with the
human and std::error_code to communicate with the caller.

This allows us to have far fewer error codes and adds the infrastructure to
print better diagnostics. This is so because the diagnostics are printed when
the issue is found. The code that detected the problem is alive in the stack and
can pass down as much context as needed. As an example the patch updates
test/Bitcode/invalid.ll.

Using a DiagnosticHandler also moves the fatal/non-fatal error decision to the
caller. A simple one like llvm-dis can just use fatal errors. The gold plugin
needs a bit more complex treatment because of being passed non-bitcode files. An
hypothetical interactive tool would make all bitcode errors non-fatal.
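
The split between a coarse result for the caller and a detailed message for the human can be sketched as below. This is a toy model of the pattern, not the bitcode reader API: `ReadResult`, `read_bitcode`, and the header check are hypothetical and only stand in for the real parsing.

```python
from enum import Enum


class ReadResult(Enum):
    # The three outcomes a caller cares about.
    OK = 0
    INVALID_BITCODE = 1
    NOT_BITCODE = 2


def read_bitcode(data, diag_handler):
    """Toy reader: returns a coarse result, reports details via the handler."""
    if not data.startswith(b"BC"):
        return ReadResult.NOT_BITCODE
    if len(data) < 4:
        # The code that found the problem still has full context here,
        # so it can describe exactly what went wrong.
        diag_handler("truncated header: expected 4 bytes, got %d" % len(data))
        return ReadResult.INVALID_BITCODE
    return ReadResult.OK


# A non-fatal caller collects diagnostics; a simple tool could raise instead.
messages = []
print(read_bitcode(b"BC\xc0", messages.append))       # ReadResult.INVALID_BITCODE
print(messages)                                       # ['truncated header: expected 4 bytes, got 3']
print(read_bitcode(b"\x7fELF", messages.append))      # ReadResult.NOT_BITCODE
```

Because the handler is supplied by the caller, the decision between fatal and non-fatal behaviour stays on the caller's side, as the message describes.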

llvm-svn: 225562
2015-01-10 00:07:30 +00:00
Alexey Samsonov 29e464f0df Fix UBSan error reports in ValueMapCallbackVH and AssertingVH<T> empty/tombstone keys generation.
Summary:
One more attempt to fix UBSan reports: make sure DenseMapInfo::getEmptyKey()
and DenseMapInfo::getTombstoneKey() don't do any upcasts/downcasts to/from Value*.

Test Plan: check-llvm test suite with/without UBSan bootstrap

Reviewers: chandlerc, dexonsmith

Subscribers: llvm-commits, majnemer

Differential Revision: http://reviews.llvm.org/D6903

llvm-svn: 225558
2015-01-09 23:17:25 +00:00
Alexey Samsonov 55acbc071c Disable Go bindings test under UBSan.
llvm-svn: 225557
2015-01-09 23:17:23 +00:00
Andrew Kaylor a10379ad49 Fix the JIT event listeners and replace the associated tests.
The changes to EventListenerCommon.h were contributed by Arch Robison.

This fixes bug 22095.

http://reviews.llvm.org/D6905

llvm-svn: 225554
2015-01-09 22:53:24 +00:00
Michael Zolotukhin d9ade185b9 Update comment.
llvm-svn: 225553
2015-01-09 22:15:06 +00:00
Hans Wennborg dcc6e5bc03 SimplifyCFG: check uses of constant-foldable instrs in switch destinations (PR20210)
The previous code assumed that such instructions could not have any uses
outside CaseDest, with the motivation that the instruction could not
dominate CommonDest because CommonDest has phi nodes in it. That simply
isn't true; e.g., CommonDest could have an edge back to itself.

llvm-svn: 225552
2015-01-09 22:13:31 +00:00
Simon Pilgrim ec1f2c2cab [X86][SSE] Avoid vector byte shuffles with zero by using pshufb to create zeros
pshufb can shuffle in zero bytes as well as bytes from a source vector - we can use this to avoid having to shuffle 2 vectors and ORing the result when the used inputs from a vector are all zeroable.
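
For reference, the zeroing behaviour of the 128-bit form of pshufb can be modelled in a few lines. The function name `pshufb` and the list-of-bytes representation are just for illustration; this models the instruction's semantics, not the lowering code in the patch.

```python
def pshufb(src, mask):
    """Model of 128-bit pshufb on lists of 16 byte values.

    For each mask byte: if its high bit is set the result byte is zero,
    otherwise the low 4 bits select a byte from `src`.
    """
    assert len(src) == 16 and len(mask) == 16
    return [0 if m & 0x80 else src[m & 0x0F] for m in mask]


src = list(range(100, 116))
# Keep bytes 0 and 2 and zero everything else, using a single shuffle.
mask = [0, 0x80, 2, 0x80] + [0x80] * 12
print(pshufb(src, mask))  # [100, 0, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```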

Differential Revision: http://reviews.llvm.org/D6878

llvm-svn: 225551
2015-01-09 22:03:19 +00:00
Kevin Enderby 0512bd75f7 Fix an ASAN failure introduced with r225537 (adding the -universal-headers to llvm-objdump).
And a fly by fix to some formatting issues with the same commit.

llvm-svn: 225550
2015-01-09 21:55:03 +00:00
Rafael Espindola 1ea49d1bdd Add a testcase of llvm-lto error handling.
llvm-svn: 225545
2015-01-09 20:55:09 +00:00
Michael Zolotukhin 1c38bc12de Remove duplicating code. NFC.
The removed condition is checked in the previous loop.

llvm-svn: 225542
2015-01-09 20:36:19 +00:00
Kevin Enderby 131d1770f6 Add the option, -universal-headers, used with -macho to print the Mach-O universal headers to llvm-objdump.
llvm-svn: 225537
2015-01-09 19:22:37 +00:00
Tim Northover eb16112e97 Re-reapply r221924: "[GVN] Perform Scalar PRE on gep indices that feed loads before
doing Load PRE"

It's not really expected to stick around, last time it provoked a weird LTO
build failure that I can't reproduce now, and the bot logs are long gone. I'll
re-revert it if the failures recur.

Original description: Perform Scalar PRE on gep indices that feed loads before
doing Load PRE.

llvm-svn: 225536
2015-01-09 19:19:56 +00:00
Lang Hames 1e923ec122 Recommit r224935 with a fix for the ObjC++/AArch64 bug that that revision
introduced.

A test case for the bug was already committed in r225385.

Patch by Rafael Espindola.

llvm-svn: 225534
2015-01-09 18:55:42 +00:00
Duncan P. N. Exon Smith 9ed19665bb Revert "Bitcode: Move the DEBUG_LOC record to DEBUG_LOC_OLD"
This reverts commit r225498 (but leaves r225499, which was a worthy
cleanup).

My plan was to change `DEBUG_LOC` to store the `MDNode` directly rather
than its operands (patch was to go out this morning), but on reflection
it's not clear that it's strictly better.  (I had missed that the
current code is unlikely to emit the `MDNode` at all.)

Conflicts:
	lib/Bitcode/Reader/BitcodeReader.cpp (due to r225499)

llvm-svn: 225531
2015-01-09 17:53:27 +00:00
Daniel Sanders 1440bb2a26 [mips] Add support for accessing $gp as a named register.
Summary:
Mips Linux uses $gp to hold a pointer to thread info structure and accesses it
with a named register. This makes this work for LLVM.

The N32 ABI doesn't quite work yet since the frontend generates incorrect IR
for this case. It neglects to truncate the 64-bit GPR to a 32-bit value before
converting to a pointer. Given correct IR (as in the testcase in this patch),
it works correctly.

Reviewers: sstankovic, vmedic, atanasyan

Reviewed By: atanasyan

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6893

llvm-svn: 225529
2015-01-09 17:21:30 +00:00
Sanjay Patel 2ce8169ed3 fix typos; remove names from comments; NFC
llvm-svn: 225528
2015-01-09 17:11:51 +00:00
Sanjay Patel 2a385e2494 remove names from comments; NFC
llvm-svn: 225526
2015-01-09 16:47:20 +00:00
Sanjay Patel 938e279082 fix typos; NFC
llvm-svn: 225525
2015-01-09 16:35:37 +00:00
Sanjay Patel e6e58c1a9e fix typo; NFC
llvm-svn: 225524
2015-01-09 16:29:50 +00:00
Sanjay Patel d729115fa7 more efficient use of a dyn_cast; no functional change intended
llvm-svn: 225523
2015-01-09 16:28:15 +00:00
Hal Finkel b359b735d6 [PowerPC] Enable late partial unrolling on the POWER7
The P7 benefits from not having really-small loops so that we either have
multiple dispatch groups in the loop and/or the ability to form more-full
dispatch groups during scheduling. Setting the partial unrolling threshold to
44 seems good, empirically, for the P7. Compared to using no late partial
unrolling, this yields the following test-suite speedups:

SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding
	-66.3253% +/- 24.1975%
SingleSource/Benchmarks/Misc-C++/oopack_v1p8
	-44.0169% +/- 29.4881%
SingleSource/Benchmarks/Misc/pi
	-27.8351% +/- 12.2712%
SingleSource/Benchmarks/Stanford/Bubblesort
	-30.9898% +/- 22.4647%

I've speculatively added a similar setting for the P8. Also, I've noticed that
the unroller does not quite calculate the unrolling factor correctly for really
tiny loops because it neglects to account for the fact that not every loop body
replicant contains an ending branch and counter increment. I'll fix that later.

llvm-svn: 225522
2015-01-09 15:51:16 +00:00
Toma Tabacu 68e8a9c0dd [mips] Add comment which explains why we need to change the assembler options before and after inline asm blocks. NFC.
llvm-svn: 225521
2015-01-09 15:00:30 +00:00
Suyog Sarda 85d0473650 Assumption that "VectorizedValue" will always be an Instruction is not correct.
It can be a constant or a vector argument.

ex :

define i32 @hadd(<4 x i32> %a) #0 {
entry:
  %vecext = extractelement <4 x i32> %a, i32 0
  %vecext1 = extractelement <4 x i32> %a, i32 1
  %add = add i32 %vecext, %vecext1
  %vecext2 = extractelement <4 x i32> %a, i32 2
  %add3 = add i32 %add, %vecext2
  %vecext4 = extractelement <4 x i32> %a, i32 3
  %add5 = add i32 %add3, %vecext4
  ret i32 %add5
}

llvm-svn: 225517
2015-01-09 10:23:48 +00:00
Saleem Abdulrasool b68fa3b576 ARM: add support for R_ARM_ABS16
Add support for R_ARM_ABS16 relocation mapping.  Addresses PR22156.

llvm-svn: 225510
2015-01-09 06:57:24 +00:00
Saleem Abdulrasool 3e81ecfeb6 test: add additional test for SVN r225507
Add an additional test case to ensure that we generate the relocation even if
the thumb target is used.

llvm-svn: 225509
2015-01-09 06:57:18 +00:00
Saleem Abdulrasool 3c0f78a2fc ARM: add support for R_ARM_ABS8 relocations
Add support for R_ARM_ABS8 relocation.  Addresses PR22126.

llvm-svn: 225507
2015-01-09 05:59:12 +00:00
Matthias Braun 7e87384592 RegisterCoalescer: Fix removeCopyByCommutingDef with subreg liveness
The code that eliminated additional coalescable copies in
removeCopyByCommutingDef() used MergeValueNumberInto() which internally
may merge A into B or B into A. In this case A and B had different Def
points, so we have to reset ValNo.Def to the intended one after merging.

llvm-svn: 225503
2015-01-09 03:01:31 +00:00
Matthias Braun ea399e59cf RegisterCoalescer: Some cleanup in removeCopyByCommutingDef(), NFC
llvm-svn: 225502
2015-01-09 03:01:28 +00:00
Matthias Braun 55586a2f2d RegisterCoalescer: No need to set kill flags, they are recomputed later anyway
llvm-svn: 225501
2015-01-09 03:01:26 +00:00
Matthias Braun 6588b145fc RegisterCoalescer: Turn some impossible conditions into asserts
llvm-svn: 225500
2015-01-09 03:01:23 +00:00
Duncan P. N. Exon Smith 52d0f16e1b Bitcode: Share logic for last instruction, NFC
Share logic for getting the last instruction emitted.

llvm-svn: 225499
2015-01-09 02:51:45 +00:00
Duncan P. N. Exon Smith 11fae74ae5 Bitcode: Move the DEBUG_LOC record to DEBUG_LOC_OLD
Prepare to simplify the `DebugLoc` record.

llvm-svn: 225498
2015-01-09 02:48:48 +00:00
Hal Finkel 5ff00b4350 [PowerPC] Add a flag for experimenting with subreg liveness tracking
This cannot yet be enabled by default, it causes ~50 miscompiles in the test
suite.

llvm-svn: 225497
2015-01-09 02:03:11 +00:00
Hal Finkel 6c39269a4c [PowerPC] Fold [sz]ext with fp_to_int lowering where possible
On modern cores with lfiw[az]x, we can fold a sign or zero extension from i32
to i64 into the load necessary for an i64 -> fp conversion.

llvm-svn: 225493
2015-01-09 01:34:30 +00:00
Hal Finkel 0ce7f372e5 [DAGCombine] Remainder of fix to r225380 (More FMA folding opportunities)
As pointed out by Aditya (and Owen), when we elide an FP extend to form an FMA,
we need to extend the incoming operands so that the resulting node will really
be legal. This is currently enabled only for PowerPC, and it happens to work
there regardless, but this should fix the functionality for everyone else
should anyone else wish to use it.

llvm-svn: 225492
2015-01-09 01:29:29 +00:00
Chandler Carruth 685b1803ab [x86] Add a flag to control the vector shuffle legality predicates that
complements the new vector shuffle lowering code path. This flag,
naturally, is *off* because we've not tested or evaluated the results of
this at all. However, the flag will make it much easier to evaluate
whether we can be this aggressive and whether there are missing vector
shuffle lowering optimizations.

llvm-svn: 225491
2015-01-09 01:24:36 +00:00
Chandler Carruth f4ea3d3d9c Cleanup ValueHandle to no longer keep a PointerIntPair for the Value*.
This was used previously for metadata but is no longer needed there. Not
doing this simplifies ValueHandle and will make it easier to fix things
like AssertingVH's DenseMapInfo.

llvm-svn: 225487
2015-01-09 00:48:47 +00:00
Hal Finkel 33ead6f901 Partial fix to r225380 (More FMA folding opportunities)
As pointed out by Aditya (and Owen), there are two things wrong with this code.
First, it adds patterns which elide FP extends when forming FMAs, and that might
not be profitable on all targets (it belongs behind the pre-existing
aggressive-FMA-formation flag). This is fixed by this change.

Second, the resulting nodes might have operands of different types (the
extensions need to be re-added). That will be fixed in the follow-up commit.

llvm-svn: 225485
2015-01-09 00:45:54 +00:00
Philip Reames 33d7f9de33 [REFACTOR] Push logic from MemDepPrinter into getNonLocalPointerDependency
Previously, MemDepPrinter handled volatile and unordered accesses without involving MemoryDependencyAnalysis.  By making a slight tweak to the documented interface - which is respected by both callers - we can move this responsibility to MDA for the benefit of any future callers.  This is basically just cleanup.

In the future, we may decide to extend MDA's non local dependency analysis to return useful results for ordered or volatile loads.  I believe (but have not really checked in detail) that local dependency analysis does get useful results for ordered, but not volatile, loads.

llvm-svn: 225483
2015-01-09 00:26:45 +00:00
Hans Wennborg becb60ffd9 ReleaseNotes.rst: these are for 3.6
llvm-svn: 225482
2015-01-09 00:21:26 +00:00
Philip Reames 567feb98f0 [Refactor] Have getNonLocalPointerDependency take the query instruction
Previously, MemoryDependenceAnalysis::getNonLocalPointerDependency was taking a list of properties about the instruction being queried. Since I'm about to need one more property to be passed down through the infrastructure - I need to know a query instruction is non-volatile in an inner helper - fix the interface once and for all.

I also added some assertions and behaviour clarifications around volatile and ordered field accesses. At the moment, this is mostly to document expected behaviour. The only non-standard instructions which can currently reach this are atomic, but unordered, loads and stores. Neither ordered nor volatile accesses can reach here.

The call in GVN is protected by an isSimple check when it first considers the load. The calls in MemDepPrinter are protected by isUnordered checks. Both utilities also check isVolatile for loads and stores.

llvm-svn: 225481
2015-01-09 00:04:22 +00:00
Duncan P. N. Exon Smith 9901034822 LangRef: Add usage points for distinct MDNodes
Omission pointed out by Sean Silva!

llvm-svn: 225479
2015-01-08 23:50:26 +00:00
Duncan P. N. Exon Smith 616b9f035c IR: Drop TODO now that PR22111 is finished
llvm-svn: 225477
2015-01-08 22:43:19 +00:00
Duncan P. N. Exon Smith 953e1a48f0 Utils: Keep distinct MDNodes distinct in MapMetadata()
Create new copies of distinct `MDNode`s instead of following the
uniquing `MDNode` logic.

Just like self-references (or other cycles), `MapMetadata()` creates a
new node.  In practice most calls use `RF_NoModuleLevelChanges`, in
which case nothing is duplicated anyway.

Part of PR22111.

llvm-svn: 225476
2015-01-08 22:42:30 +00:00
Duncan P. N. Exon Smith 090a19bd3c IR: Add 'distinct' MDNodes to bitcode and assembly
Propagate whether `MDNode`s are 'distinct' through the other types of IR
(assembly and bitcode).  This adds the `distinct` keyword to assembly.

Currently, no one actually calls `MDNode::getDistinct()`, so these nodes
only get created for:

  - self-references, which are never uniqued, and
  - nodes whose operands are replaced that hit a uniquing collision.

The concept of distinct nodes is still not quite first-class, since
distinct-ness doesn't yet survive across `MapMetadata()`.
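
The difference between uniqued and distinct nodes can be pictured with a toy model. The classes and method names below are hypothetical and far simpler than the real `MDNode` machinery; they only show that uniqued nodes with equal operands share one object while distinct nodes never do.

```python
class Node:
    def __init__(self, operands, distinct):
        self.operands = tuple(operands)
        self.distinct = distinct


class Context:
    """Toy context holding a uniquing table keyed by operands."""

    def __init__(self):
        self._uniqued = {}

    def get(self, operands):
        # Uniqued: equal operands always yield the same node object.
        key = tuple(operands)
        if key not in self._uniqued:
            self._uniqued[key] = Node(key, distinct=False)
        return self._uniqued[key]

    def get_distinct(self, operands):
        # Distinct: always a fresh node, never entered in the table.
        return Node(operands, distinct=True)


ctx = Context()
print(ctx.get(["a", "b"]) is ctx.get(["a", "b"]))                    # True
print(ctx.get_distinct(["a", "b"]) is ctx.get_distinct(["a", "b"]))  # False
```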

Part of PR22111.

llvm-svn: 225474
2015-01-08 22:38:29 +00:00
Sanjay Patel 22ffa9b291 remove function names from comments; NFC
llvm-svn: 225473
2015-01-08 22:36:56 +00:00
Hal Finkel 3c0952b072 [PowerPC] Mark all instructions as non-cheap for MachineLICM
MachineLICM uses a callback named hasLowDefLatency to determine if an
instruction def operand has a 'low' latency. If all relevant operands have a
'low' latency, the instruction is considered too cheap to hoist out of loops
even in low-register-pressure situations. On PowerPC cores, both the embedded
cores and the others, there is no reason to believe that this is a good choice:
all instructions have a cost inside a loop, and hoisting them when not limited
by register pressure is a reasonable default.

llvm-svn: 225471
2015-01-08 22:11:49 +00:00
Hal Finkel 0709f5160f [MachineLICM] A command-line option to hoist even cheap instructions
Add a command-line option to enable hoisting even cheap instructions (in
low-register-pressure situations). This is turned off by default, but has
proved useful for testing purposes.

llvm-svn: 225470
2015-01-08 22:10:48 +00:00
Duncan P. N. Exon Smith e90f1165d8 CodeGen: Use handy new-fangled post-increment, NFC
Drive-by cleanup; I noticed this when reviewing the patch that became
r225466.

llvm-svn: 225468
2015-01-08 21:07:55 +00:00
Akira Hatanaka 442b40c2eb [ARM] Fix a bug in constant island pass that was triggering an assertion.
The assert was being triggered when the distance between a constant pool entry
and its user exceeded the maximally allowed distance after thumb2 branch
shortening. A padding was inserted after a thumb2 branch instruction was shrunk,
which caused the user to be out of range. This is wrong as the padding should
have been inserted by the layout algorithm so that the distance between two
instructions doesn't grow later during thumb2 instruction optimization.

This commit fixes the code in ARMConstantIslands::createNewWater to call
computeBlockSize and set BasicBlock::Unalign when a branch instruction is
inserted to create new water after a basic block. A non-zero Unalign causes
the worst-case padding to be inserted when adjustBBOffsetsAfter is called to
recompute the basic block offsets.

rdar://problem/19130476

llvm-svn: 225467
2015-01-08 20:44:50 +00:00
Duncan P. N. Exon Smith 5914a97af8 CodeGen: Use range-based for loops, NFC
Patch by Ramkumar Ramachandra!

llvm-svn: 225466
2015-01-08 20:44:33 +00:00
Matt Arsenault b935d9df4c Fix fcmp + fabs instcombines when using the intrinsic
This was only handling the libcall. This is another example
of why only the intrinsic should ever be used when it exists.

llvm-svn: 225465
2015-01-08 20:09:34 +00:00
Eric Christopher a8c6a0a03f The Kaleidoscope tutorial should be using "mcjit" for the library,
"jit" doesn't exist anymore.

llvm-svn: 225462
2015-01-08 19:07:01 +00:00
Lang Hames e89539f711 [MCJIT] Remove a few redundant MCJIT tests, and drop the extraneous datalayout
strings from the copies that remain.

llvm-svn: 225460
2015-01-08 18:52:15 +00:00
Eric Christopher 90724285a2 Make the TargetMachine in MipsSubtarget a reference rather
than a pointer to make unifying code a bit easier.

llvm-svn: 225459
2015-01-08 18:18:57 +00:00
Eric Christopher d8abc3a956 Update include - this class doesn't use the target machine, but
only the subtarget.

llvm-svn: 225458
2015-01-08 18:18:54 +00:00
Eric Christopher 1933f20aa4 Fix a couple of odd formatting issues.
llvm-svn: 225457
2015-01-08 18:18:53 +00:00
Eric Christopher 09455d94bf This routine is in InstrInfo, there's no need to access it again.
llvm-svn: 225456
2015-01-08 18:18:50 +00:00
Ahmed Bougacha d716121888 [X86] Reflow comment. NFC.
llvm-svn: 225455
2015-01-08 17:49:48 +00:00
Rafael Espindola 261d25b940 clang-format. NFC.
llvm-svn: 225454
2015-01-08 16:25:01 +00:00
Rafael Espindola dffdf14bb7 Make this test a bit stricter.
It now checks for the end of the line or the opening '{'.
While at it, remove empty comments.

llvm-svn: 225451
2015-01-08 16:11:18 +00:00
Justin Hibbits 98a532dd8e Add saving and restoring of r30 to the prologue and epilogue, respectively
Summary: The PIC additions didn't update the prologue and epilogue code to save and restore r30 (PIC base register).  This does that.

Test Plan: Tests updated.

Reviewers: hfinkel

Reviewed By: hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6876

llvm-svn: 225450
2015-01-08 15:47:19 +00:00
Rafael Espindola bec6af62b8 Explicitly handle LinkOnceODRAutoHideLinkage. NFC. We already have a test.
llvm-svn: 225449
2015-01-08 15:39:50 +00:00