Commit Graph

87993 Commits

Author SHA1 Message Date
Simon Pilgrim 7b2164ffe0 Fix spelling.
llvm-svn: 263266
2016-03-11 17:31:43 +00:00
Quentin Colombet dd4b137364 [IRTranslator] Translate unconditional branches.
llvm-svn: 263265
2016-03-11 17:28:03 +00:00
Quentin Colombet f9b4934d1d [MachineIRBuilder] Rework buildInstr API to maximize code reuse.
llvm-svn: 263264
2016-03-11 17:27:58 +00:00
Quentin Colombet e225e2541b [IRTranslator] Update getOrCreateVReg API to use references.
A value that we want to keep in a virtual register cannot be null.
Reflect that in the API.

llvm-svn: 263263
2016-03-11 17:27:54 +00:00
Quentin Colombet 000b580b13 [MachineIRBuilder] Rename the setter of MF for consistency with the getter.
llvm-svn: 263262
2016-03-11 17:27:51 +00:00
Quentin Colombet 91ebd71e26 [MachineIRBuilder] Rename the setter for MBB for consistency with the getter.
llvm-svn: 263261
2016-03-11 17:27:47 +00:00
Quentin Colombet 53237a9e64 [IRTranslator] Update getOrCreateBB API to use references.
A null basic block is invalid, so just pass a reference.

llvm-svn: 263260
2016-03-11 17:27:43 +00:00
Mehdi Amini 99eab3dd06 Remove PreserveNames template parameter from IRBuilder
Summary:
Following r263086, we are now relying on a flag on the Context to
discard Value names in release builds.

Reviewers: chandlerc

Subscribers: mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D18023

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263258
2016-03-11 17:15:50 +00:00
Mehdi Amini 1e9c925182 Do not specialize IRBuilder to strip names in SROA
Summary:
Following r263086, we are replacing this by a runtime check.
More cleanup will follow on the IRBuilder itself, but I submitted
this patch separately as SROA has a fancy "prefixInserter" class
that needs extra-love.

Reviewers: chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18022

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263256
2016-03-11 17:15:34 +00:00
Chad Rosier ac216fd9d5 [misched] Fix a truncation issue from r263021.
The truncation was causing the sorting algorithm to behave oddly when comparing
positive and negative offsets.  Fortunately, this doesn't currently happen in
practice and was exposed by a WIP.  Thus, I can't test this change now, but the
follow on patch will.

llvm-svn: 263255
2016-03-11 16:54:07 +00:00
Chandler Carruth ace8c8f765 [PM] Sink the "Expression" type for GVN into the class as a private
member type.

Because of how this type is used by the ValueTable, it cannot actually
have hidden visibility. GCC actually nicely warns about this but Clang
just silently ... I don't even know. =/ We should do a better job either
way though.

This should resolve a bunch of the GCC warnings about visibility that
the port of GVN triggered and make the visibility story a bit more
correct.

llvm-svn: 263250
2016-03-11 16:25:19 +00:00
Marianne Mailhot-Sarrasin 7423f40674 More UTF string conversion wrappers
Added new string conversion wrappers that convert between `std::string` (of UTF-8 bytes) and `std::wstring`, which is particularly useful for Win32 interop. Also fixed a missing string conversion for `getenv` on Win32, using these new wrappers.
The motivation behind this is to provide the support functions required for LLDB to work properly on Windows with non-ASCII data; however, the functions are not LLDB specific.

Patch by cameron314

Differential Revision: http://reviews.llvm.org/D17549

llvm-svn: 263247
2016-03-11 15:59:32 +00:00
Valery Pykhtin a7f480b4e9 [AMDGPU] Fix VOPC instruction operand namings
Differential Revision: http://reviews.llvm.org/D17966

llvm-svn: 263242
2016-03-11 14:53:28 +00:00
Simon Pilgrim 7ca9614c71 [X86][AVX] Fixed issue where a long chain of shuffles could attempt to combine to a single (illegal) PSHUFB instruction.
Its not enough that we test for SSSE3 - that's only OK for 128-bit vectors - we also need to test for AVX2 / AVX512BW for 256/512 bit vector cases.

llvm-svn: 263239
2016-03-11 14:39:10 +00:00
Chandler Carruth 5bfbc3f941 [AA] Make BasicAA just require domtree.
This doesn't change how many times we construct domtrees in the normal
pipeline, and it removes fragility and instability where basic-aa may
not be run in time to see domtrees because they happen to be constructed
afterward.

This isn't quite as clean as the change to memdep because there is
a mode where basic-aa specifically runs without domtrees -- in the
hacking version used by function-attrs with the legacy pass manager.

llvm-svn: 263234
2016-03-11 13:53:18 +00:00
Chandler Carruth aef32bd319 [memdep] Just require domtree for memdep.
This doesn't cause us to construct dominator trees any more often in the
normal pipeline, and removes an entire mode of memdep that needed to be
reasoned about and maintained. Perhaps more importantly, it removes the
ability for the results of memdep to be different because of accidental
pass scheduling goofs or the order of evaluation of 'getResult' calls.

Essentially, 'getCachedResult', unless across IR-unit boundaries, is
extremely dangerous. We need to work much harder to avoid it (or its
analog in the old pass manager).

llvm-svn: 263232
2016-03-11 13:46:00 +00:00
Chandler Carruth 3bc9c7fb45 [PM] The order of evaluation of these analyses is actually significant,
much to my horror, so use variables to fix it in place.

This terrifies me. Both basic-aa and memdep will provide more precise
information when the domtree and/or the loop info is available. Because
of this, if your pass (like GVN) requires domtree, and then queries
memdep or basic-aa, it will get more precise results. If it does this in
the other order, it gets less precise results.

All of the ideas I have for fixing this are, essentially, terrible. Here
I've just caused us to stop having unspecified behavior as different
implementations evaluate the order of these arguments differently. I'm
actually rather glad that they do, or the fragility of memdep and
basic-aa would have gone on unnoticed. I've left comments so we don't
immediately break this again. This should fix bots whose host compilers
evaluate the order of arguments differently from Clang.

llvm-svn: 263231
2016-03-11 13:26:47 +00:00
Vasileios Kalintiris e2cbc21b6f [mips] MIPSR6 Instruction itineraries
Summary: Defines instruction itineraries for common MIPSR6 instructions.

Patch by Simon Dardis.

Reviewers: vkalintiris

Subscribers: MatzeB, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D17198

llvm-svn: 263229
2016-03-11 13:05:06 +00:00
Daniel Sanders 78e8902097 [mips] Range check simm4.
Summary:

Reviewers: vkalintiris

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D16811

llvm-svn: 263220
2016-03-11 11:37:50 +00:00
Chandler Carruth b47f8010a9 [PM] Make the AnalysisManager parameter to run methods a reference.
This was originally a pointer to support pass managers which didn't use
AnalysisManagers. However, that doesn't realistically come up much and
the complexity of supporting it doesn't really make sense.

In fact, *many* parts of the pass manager were just assuming the pointer
was never null already. This at least makes it much more explicit and
clear.

llvm-svn: 263219
2016-03-11 11:05:24 +00:00
Chandler Carruth 30a073029c [PM] Rename the CRTP mixin base classes for the new pass manager to
clarify their purpose.

Firstly, call them "...Mixin" types so it is clear that there is no
type hierarchy being formed here. Secondly, use the term 'Info' to
clarify that they aren't adding any interesting *semantics* to the
passes or analyses, just exposing APIs used by the management layer to
get information about the pass or analysis.

Thanks to Manuel for helping pin down the naming confusion here and come
up with effective names to address it.

In case you already have some out-of-tree stuff, the following should be
roughly what you want to update:

  perl -pi -e 's/\b(Pass|Analysis)Base\b/\1InfoMixin/g'

llvm-svn: 263217
2016-03-11 10:33:22 +00:00
Chandler Carruth b4faf13c15 [PM] Implement the final conclusion as to how the analysis IDs should
work in the face of the limitations of DLLs and templated static
variables.

This requires passes that use the AnalysisBase mixin provide a static
variable themselves. So as to keep their APIs clean, I've made these
private and befriended the CRTP base class (which is the common
practice).

I've added documentation to AnalysisBase for why this is necessary and
at what point we can go back to the much simpler system.

This is clearly a better pattern than the extern template as it caught
*numerous* places where the template magic hadn't been applied and
things were "just working" but would eventually have broken
mysteriously.

llvm-svn: 263216
2016-03-11 10:22:49 +00:00
Benjamin Kramer c126353473 [InstCombine] Use Twines to generate names.
Since the names are used in a loop this does more work in debug builds. In
release builds value names are generally discarded so we don't have to do
the concatenation at all. It's also simpler code, no functional change
intended.

llvm-svn: 263215
2016-03-11 10:20:56 +00:00
Nikolay Haustov 6560781c4f [AMDGPU] Assembler: change v_madmk operands to have same order as mad.
The constant is now at source operand 1 (previously at 2).
This is also how it is in legacy AMD sp3 assembler.
Update tests.

Differential Revision: http://reviews.llvm.org/D17984

llvm-svn: 263212
2016-03-11 09:27:25 +00:00
Chandler Carruth 45a9c203a0 [PM/AA] Teach the AAManager how to handle module analyses in addition to
function analyses, and use it to wire up globals-aa to the new pass
manager.

llvm-svn: 263211
2016-03-11 09:15:11 +00:00
Chandler Carruth 89c45a162f [PM] Port GVN to the new pass manager, wire it up, and teach a couple of
tests to run GVN in both modes.

This is mostly the boring refactoring just like SROA and other complex
transformation passes. There is some trickiness in that GVN's
ValueNumber class requires hand holding to get to compile cleanly. I'm
open to suggestions about a better pattern there, but I tried several
before settling on this. I was trying to balance my desire to sink as
much implementation detail into the source file as possible without
introducing overly many layers of abstraction.

Much like with SROA, the design of this system is made somewhat more
cumbersome by the need to support both pass managers without duplicating
the significant state and logic of the pass. The same compromise is
struck here.

I've also left a FIXME in a doxygen comment as the GVN pass seems to
have pretty woeful documentation within it. I'd like to submit this with
the FIXME and let those more deeply familiar backfill the information
here now that we have a nice place in an interface to put that kind of
documentaiton.

Differential Revision: http://reviews.llvm.org/D18019

llvm-svn: 263208
2016-03-11 08:50:55 +00:00
Matt Arsenault bafc9dc591 AMDGPU: Don't use InstVisitor for AMDGPUPromoteAlloca
Frontend authors are strongly encouraged to keep allocas
in the entry block, so don't bother visiting every instruction
in the other blocks of the function.

llvm-svn: 263206
2016-03-11 08:20:50 +00:00
Matt Arsenault 6b6a2c37bc AMDGPU: R600 code splitting cleanup
Move a few functions only used by R600 to R600 specific code,
fix header macros to stop using R600, mark classes as final.

llvm-svn: 263204
2016-03-11 08:00:27 +00:00
Matt Arsenault 9a19c240c0 AMDGPU: Materialize sign bits with bfrev
If a constant is the same as the reverse of an inline immediate,
this is 4 bytes smaller than having to embed a 32-bit literal.

llvm-svn: 263201
2016-03-11 07:42:49 +00:00
Junmo Park 6098cbbd2c Minor code cleanups. NFC.
llvm-svn: 263200
2016-03-11 07:05:32 +00:00
Junmo Park 4ba6cf69e4 Minor code cleanup. NFC.
llvm-svn: 263196
2016-03-11 05:07:07 +00:00
Pete Cooper adebb9379a Remove llvm::getDISubprogram in favor of Function::getSubprogram
llvm::getDISubprogram walks the instructions in a function, looking for one in the scope of the current function, so that it can find the !dbg entry for the subprogram itself.

Now that !dbg is attached to functions, this should not be necessary. This patch changes all uses to just query the subprogram directly on the function.

Ideally this should be NFC, but in reality its possible that a function:

has no !dbg (in which case there's likely a bug somewhere in an opt pass), or
that none of the instructions had a scope referencing the function, so we used to not find the !dbg on the function but now we will

Reviewed by Duncan Exon Smith.

Differential Revision: http://reviews.llvm.org/D18074

llvm-svn: 263184
2016-03-11 02:14:16 +00:00
Adam Nemet efb234135c [LLE] Add missed LoopSimplify dependence
The code assumed that we always had a preheader without making the pass
dependent on LoopSimplify.

Thanks to Mattias Eriksson V for reporting this.

llvm-svn: 263173
2016-03-10 23:54:39 +00:00
Tim Northover 6092de5075 AArch64: only try to use scaled fcvt ops on legal vector types.
Before we ended up calling getSimpleVectorType on a <3 x float>, which
asserted.

llvm-svn: 263169
2016-03-10 23:02:21 +00:00
Sanjay Patel 0181943b89 [x86] don't use a shuffle when a vselect will do; NFCI
Looking at the IR definition of a masked load made me realize
there was no reason to use a shuffle here, so we don't need
to convert the format of the mask at all.

llvm-svn: 263167
2016-03-10 22:35:33 +00:00
Marianne Mailhot-Sarrasin eddc5b130e Test commit access
llvm-svn: 263165
2016-03-10 21:54:25 +00:00
Simon Pilgrim 8c9f00f788 Strip trailing whitespace.
llvm-svn: 263162
2016-03-10 20:58:11 +00:00
Simon Pilgrim 61eb49e437 [X86][SSE] Reapplied: Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG
Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns.

Reapplied with a fix for PR26870 (avoid premature use of TargetConstant in ZERO_EXTEND_VECTOR_INREG expansion).

Differential Revision: http://reviews.llvm.org/D17691

llvm-svn: 263159
2016-03-10 20:40:26 +00:00
Artur Pilipenko 3c8fc57e16 Support arbitrary addrspace pointers in masked load/store intrinsics
This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace.

The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics.

Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D17270

llvm-svn: 263158
2016-03-10 20:39:22 +00:00
Peter Collingbourne aba16fca5d ARM: Support relative references using the PREL31 symbol variant.
Differential Revision: http://reviews.llvm.org/D17937

llvm-svn: 263156
2016-03-10 19:30:18 +00:00
Teresa Johnson 0556e2217d Materialize metadata in IRLinker before value mapping
Summary:
Unless we plan to do later postpass metadata linking (ThinLTO special mode),
always invoke metadata materialization at the start of IRLinker::run().
This avoids the need for clients who use lazy metadata loading to
explicitly invoke materializeMetadata before the IRMover, which in
turn invokes IRLinker::run and needs materialized metadata for mapping.

Came up in the context of an LLD issue (D17982).

Reviewers: rafael

Subscribers: silvas, llvm-commits

Differential Revision: http://reviews.llvm.org/D17992

llvm-svn: 263143
2016-03-10 18:47:03 +00:00
Tim Northover 00e2dcec02 AArch64: remove pseudo-instructions used only for their patterns.
There's no real reason for these pseudos to exist, we should be writing real
patterns even if it is slightly less convenient. NFC.

llvm-svn: 263141
2016-03-10 18:46:12 +00:00
Nicolai Haehnle b142770bfe AMDGPU/SI: add llvm.amdgcn.buffer.load/store.format intrinsics
Summary:
They correspond to BUFFER_LOAD/STORE_FORMAT_XYZW and will be used by Mesa
to implement the GL_ARB_shader_image_load_store extension.

The intention is that for llvm.amdgcn.buffer.load.format, LLVM will decide
whether one of the _X/_XY/_XYZ opcodes can be used (similar to image sampling
and loads). However, this is not currently implemented.

For llvm.amdgcn.buffer.store, LLVM cannot decide to use one of the "smaller"
opcodes and therefore the intrinsic is overloaded. Currently, only the v4f32
is actually implemented since GLSL also only has a vec4 variant of the store
instructions, although it's conceivable that Mesa will want to be smarter
about this in the future.

BUFFER_LOAD_FORMAT_XYZW is already exposed via llvm.SI.vs.load.input, which
has a legacy name, pretends not to access memory, and does not capture the
full flexibility of the instruction.

Reviewers: arsenm, tstellarAMD, mareko

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17277

llvm-svn: 263140
2016-03-10 18:43:50 +00:00
Michael Kuperstein 8be8de6d62 [X86] Correctly select registers to pop into for x86_64
When trying to replace an add to esp with pops, we need to choose dead
registers to pop into. Registers clobbered by the call and not imp-def'd
by it should be safe. Except that it's not enough to check the register
itself isn't defined, we also need to make sure no overlapping registers
are defined either.

This fixes PR26711.

Differential Revision: http://reviews.llvm.org/D18029

llvm-svn: 263139
2016-03-10 18:43:21 +00:00
Balaram Makam e9b2725287 [AArch64] Optimize compare and branch sequence when the compare's constant operand is power of 2
Summary:
Peephole optimization that generates a single TBZ/TBNZ instruction
for test and branch sequences like in the example below. This handles
the cases that miss folding of AND into TBZ/TBNZ during ISelLowering of BR_CC

Examples:
   and  w8, w8, #0x400
   cbnz w8, L1
 to
   tbnz w8, #10, L1

Reviewers: MatzeB, jmolloy, mcrosier, t.p.northover

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D17942

llvm-svn: 263136
2016-03-10 17:54:55 +00:00
Alexandros Lamprineas 843164242e [ARM] Cortex-R8 support
This patch adds Cortex-R8 to Target Parser and TableGen.
It also adds CodeGen tests for the build attributes.

Patch by Pablo Barrio.

Differential Revision: http://reviews.llvm.org/D17925

llvm-svn: 263132
2016-03-10 17:38:41 +00:00
Mehdi Amini 1592cb9aa1 Rename -discard-value-names into -lto-discard-value-names in libLLVMLTO
This is avoiding a naming conflict with opt and llc.
While opt and llc don't link to LTO usually, users that are building a
monolithic libLLVM.dylib and linking the tools to it would have a
runtime error because of the duplicate cl::opt registration.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263127
2016-03-10 17:06:52 +00:00
Changpeng Fang 278a5b31a5 AMDGPU/SI: Define S_GETREG Intrinsic
Summary:
 Define s_getreg intrinsic to generate s_getreg instruction to read
hardware registers.

Reviewers: tstellarAMD, arsenm

Subscribers: llvm-commits, arsenm

Differential Revision: http://reviews.llvm.org/D17892

llvm-svn: 263124
2016-03-10 16:47:15 +00:00
Saleem Abdulrasool 1632fe1f77 ARM: follow up improvements for SVN r263118
The initial change was insufficiently complete for always getting the semantics
of __builtin_longjmp correct.  The builtin is translated into a
`tInt_eh_sjlj_longjmp` DAG node.  This node set R7 as clobbered.  However, the
code would then follow up with a clobber of R11.  I had failed to notice the
imp-def,kill on R7 in the isel.  Unfortunately, it seems that it is not possible
to conditionalise the Defs list via an !if.  Instead, construct a new parallel
WIN node and prefer that when targeting windows.  This ensures that we now both
correctly model the __builtin_longjmp as well as construct the frame in a more
ABI conformant manner.

llvm-svn: 263123
2016-03-10 16:26:37 +00:00
Chandler Carruth 37f1f12226 [SROA] Fix PR25873, which Andrea Di Biagio analyzed the daylights out
of, and I misdiagnosed for months and months.

Andrea has had a patch for this forever, but I just couldn't see how
it was fixing the root cause of the problem. It didn't make sense to me,
even though the patch was perfectly good and the analysis of the actual
failure event was *fantastic*.

Well, I came back to it today because the patch has sat for *far* too
long and needs attention and decided I wouldn't let it go until I really
understood what was going on. After quite some time in the debugger,
I finally realized that in fact I had just missed an important case with
my previous attempt to fix PR22093 in r225149. Not only do we need to
handle loads that won't be split, but stores-of-loads that we won't
split. We *do* actually have enough logic in the presplitting to form
new slices for split stores.... *unless* we decided not to split them!

I'm so sorry that it took me this long to come to the realization that
this is the issue. It seems so obvious in hind sight (of course).
Anyways, the fix becomes *much* smaller and more focused. The fact that
we're left doing integer smashing is related to the FIXME in my original
commit: fundamentally, we're not aggressive about pre-splitting for
loads and stores to the same alloca. If we want to get aggressive about
this, it'll need both what Andrea had put into the proposed fix, but
also a *lot* more logic to essentially iteratively pre-split the alloca
until we can't do any more. As I said in that commit log, its really
unclear that this is the right call. Instead, the integer blending and
letting targets lower this to narrower stores seems slightly better. But
we definitely shouldn't really go down that path just to fix this bug.

Again, tons of thanks are owed to Andrea and others at Sony for working
on this bug. I really should have seen what was going on here and
re-directed them sooner. =////

llvm-svn: 263121
2016-03-10 15:31:17 +00:00