Commit Graph

208018 Commits

Author SHA1 Message Date
Sanjay Patel 40d4eb40f6 [x86] enable machine combiner reassociations for scalar single-precision minimums
llvm-svn: 245166
2015-08-15 17:01:54 +00:00
Simon Pilgrim d65ace84c7 Updated broadcast stack folding test to avoid use of broadcast intrinsics.
llvm-svn: 245165
2015-08-15 16:54:18 +00:00
Sanjay Patel 3b7e3677e3 fix typos; NFC
llvm-svn: 245164
2015-08-15 16:53:08 +00:00
Sanjay Patel 9f6c7dddd2 add test case to show current codegen
llvm-svn: 245163
2015-08-15 16:49:50 +00:00
Davide Italiano 32cbff7809 [Sema] Be consistent about diagnostic wording: always use "cannot".
Discussed with Richard Smith.

llvm-svn: 245162
2015-08-15 15:23:14 +00:00
Yaron Keren 8b2a031cff Silence VS2015 warning.
Patch by James Touton!

http://reviews.llvm.org/D11890

llvm-svn: 245161
2015-08-15 14:54:43 +00:00
Simon Pilgrim 0750c84623 [DAGCombiner] Attempt to mask vectors before zero extension instead of after.
For cases where we TRUNCATE and then ZERO_EXTEND to a larger size (often from vector legalization), see if we can mask the source data and then ZERO_EXTEND (instead of after a ANY_EXTEND). This can help avoid having to generate a larger mask, and possibly applying it to several sub-vectors.

(zext (truncate x)) -> (zext (and(x, m))

Includes a minor patch to SystemZ to better recognise 8/16-bit zero extension patterns from RISBG bit-extraction code.

This is the first of a number of minor patches to help improve the conversion of byte masks to clear mask shuffles.

Differential Revision: http://reviews.llvm.org/D11764

llvm-svn: 245160
2015-08-15 13:27:30 +00:00
Tobias Grosser 234a48270e AST Generation Paper published in TOPLAS
The July issue of TOPLAS contains a 50 page discussion of the AST generation
techniques used in Polly. This discussion gives not only an in-depth
description of how we (re)generate an imperative AST from our polyhedral based
mathematical program description, but also gives interesting insights about:

- Schedule trees: A tree-based mathematical program description that enables us
to perform loop transformations on an abstract level, while issues like the
generation of the correct loop structure and loop bounds will be taken care of
by our AST generator.

- Polyhedral unrolling: We discuss techniques that allow the unrolling of
non-trivial loops in the context of parameteric loop bounds, complex tile
shapes and conditionally executed statements. Such unrolling support enables
the generation of predicated code e.g. in the context of GPGPU computing.

- Isolation for full/partial tile separation: We discuss native support for
handling full/partial tile separation and -- in general -- native support for
isolation of boundary cases to enable smooth code generation for core
computations.

- AST generation with modulo constraints: We discuss how modulo mappings are
lowered to efficient C/LLVM code.

- User-defined constraint sets for run-time checks We discuss how arbitrary
sets of constraints can be used to automatically create run-time checks that
ensure a set of constrainst actually hold. This feature is very useful to
verify at run-time various assumptions that have been taken program
optimization.

Polyhedral AST generation is more than scanning polyhedra
Tobias Grosser, Sven Verdoolaege, Albert Cohen
ACM Transations on Programming Languages and Systems (TOPLAS), 37(4), July 2015

llvm-svn: 245157
2015-08-15 09:34:33 +00:00
Tobias Grosser 4c45542595 Update link to Polly paper
By going through my personal website, people can go directly to the paper.

llvm-svn: 245156
2015-08-15 09:34:28 +00:00
Chandler Carruth e8824e3026 [PM/AA] Delete the LibCallAliasAnalysis and all the associated
infrastructure.

This AA was never used in tree. It's infrastructure also completely
overlaps that of TargetLibraryInfo which is used heavily by BasicAA to
achieve similar goals to those stated for this analysis.

As has come up in several discussions, the use case here is still really
important, but this code isn't helping move toward that use case. Any
progress on better supporting rich AA information for runtime library
environments would likely be better off starting from scratch or
starting from TargetLibraryInfo than from this base.

Differential Revision: http://reviews.llvm.org/D12028

llvm-svn: 245155
2015-08-15 09:22:21 +00:00
James Y Knight 2db38f33f3 Tiny cleanup: move some Triple variables up to the top of the
function, and remove a duplicate var.

llvm-svn: 245154
2015-08-15 03:45:25 +00:00
David Majnemer e888a2f655 [MS ABI] Switch catchpad/cleanuppad to use tokens
llvm-svn: 245153
2015-08-15 03:21:08 +00:00
David Majnemer ad28aaa131 [IR] Update CreateCatchRet to take a return value
llvm-svn: 245152
2015-08-15 03:19:29 +00:00
Jason Molenda a3664138dd Update DynamicRegisterInfo::SetRegisterInfo to accept eh_frame register
numbers in the key name "ehframe" or "eh_frame" in addition to the deprecated
"gcc" name (e.g. from a plugin.process.gdb-remote.target-definition-file
python file).

llvm-svn: 245151
2015-08-15 02:59:42 +00:00
Matt Arsenault 588732bd6e AMDGPU/SI: Only look at live out SGPR defs
When trying to fix SGPR live ranges, skip defs that are
killed in the same block as the def. I don't think
we need to worry about these cases as long as the
live ranges of the SGPRs in dominating blocks are
correct.

This reduces the number of elements the second
loop over the function needs to look at, and makes
it generally easier to understand. The second loop
also only considers if the live range is live
in to a block, which logically means it
must have been live out from another.

llvm-svn: 245150
2015-08-15 02:58:49 +00:00
David Majnemer 0bc0eef71c [IR] Give catchret an optional 'return value' operand
Some personality routines require funclet exit points to be clearly
marked, this is done by producing a token at the funclet pad and
consuming it at the corresponding ret instruction.  CleanupReturnInst
already had a spot for this operand but CatchReturnInst did not.
Other personality routines don't need to use this which is why it has
been made optional.

llvm-svn: 245149
2015-08-15 02:46:08 +00:00
James Y Knight 5567bafe93 Remove redundant TargetFrameLowering::getFrameIndexOffset virtual
function.

This was the same as getFrameIndexReference, but without the FrameReg
output.

Differential Revision: http://reviews.llvm.org/D12042

llvm-svn: 245148
2015-08-15 02:32:35 +00:00
NAKAMURA Takumi caad877d3e clang-tools-extra/test/clang-tidy/modernize-pass-by-value.cpp: Tweak not to override -std=c++11.
llvm-svn: 245147
2015-08-15 02:27:22 +00:00
NAKAMURA Takumi 654c2bbaf5 clang-tools-extra/test/clang-tidy/modernize-pass-by-value.cpp: Appease targeting MS to give -fno-delayed-template-parsing.
llvm-svn: 245146
2015-08-15 02:05:49 +00:00
NAKAMURA Takumi a0d39dd80a clangStaticAnalyzerCheckers: Update libdesp.
llvm-svn: 245145
2015-08-15 01:56:49 +00:00
NAKAMURA Takumi fe745cad47 clangTidyModernizeModule: Update libdeps.
llvm-svn: 245144
2015-08-15 01:32:15 +00:00
JF Bastien d4698e1bac [WebAssembly] Add Relooper
This is just an initial checkin of an implementation of the Relooper algorithm, in preparation for WebAssembly codegen to utilize. It doesn't do anything yet by itself.

The Relooper algorithm takes an arbitrary control flow graph and generates structured control flow from that, utilizing a helper variable when necessary to handle irreducibility. The WebAssembly backend will be able to use this in order to generate an AST for its binary format.

Author: azakai

Reviewers: jfb, sunfish

Subscribers: jevinskie, arsenm, jroelofs, llvm-commits

Differential revision: http://reviews.llvm.org/D11691

llvm-svn: 245142
2015-08-15 01:23:28 +00:00
Jason Molenda a18f7071c2 A messy bit of cleanup: Move towards more descriptive names
for eh_frame and stabs register numberings.  This is not
complete but it's a step in the right direction.  It's almost
entirely mechanical.

lldb informally uses "gcc register numbering" to mean eh_frame.
Why?  Probably because there's a notorious bug with gcc on i386
darwin where the register numbers in eh_frame were incorrect.
In all other cases, eh_frame register numbering is identical to
dwarf.

lldb informally uses "gdb register numbering" to mean stabs.
There are no official definitions of stabs register numbers
for different architectures, so the implementations of gdb
and gcc are the de facto reference source.

There were some incorrect uses of these register number types
in lldb already.  I fixed the ones that I saw as I made
this change.

This commit changes all references to "gcc" and "gdb" register
numbers in lldb to "eh_frame" and "stabs" to make it clear 
what is actually being represented.

lldb cannot parse the stabs debug format, and given that no
one is using stabs any more, it is unlikely that it ever will.
A more comprehensive cleanup would remove the stabs register
numbers altogether - it's unnecessary cruft / complication to
all of our register structures.

In ProcessGDBRemote, when we get register definitions from
the gdb-remote stub, we expect to see "gcc:" (qRegisterInfo)
or "gcc_regnum" (qXfer:features:read: packet to get xml payload).
This patch changes ProcessGDBRemote to also accept "ehframe:"
and "ehframe_regnum" from these remotes.

I did not change GDBRemoteCommunicationServerLLGS or debugserver
to send these new packets.  I don't know what kind of interoperability
constraints we might be working under.  At some point in the future
we should transition to using the more descriptive names.

Throughout lldb we're still using enum names like "gcc_r0" and "gdb_r0",
for eh_frame and stabs register numberings.  These should be cleaned
up eventually too.

The sources link cleanly on macosx native with xcode build.  I
don't think we'll see problems on other platforms but please let
me know if I broke anyone.

llvm-svn: 245141
2015-08-15 01:21:01 +00:00
JF Bastien 5e4303dc14 Accelerate MergeFunctions with hashing
This patch makes the Merge Functions pass faster by calculating and comparing
a hash value which captures the essential structure of a function before
performing a full function comparison.

The hash is calculated by hashing the function signature, then walking the basic
blocks of the function in the same order as the main comparison function. The
opcode of each instruction is hashed in sequence, which means that different
functions according to the existing total order cannot have the same hash, as
the comparison requires the opcodes of the two functions to be the same order.

The hash function is a static member of the FunctionComparator class because it
is tightly coupled to the exact comparison function used. For example, functions
which are equivalent modulo a single variant callsite might be merged by a more
aggressive MergeFunctions, and the hash function would need to be insensitive to
these differences in order to exploit this.

The hashing function uses a utility class which accumulates the values into an
internal state using a standard bit-mixing function. Note that this is a different interface
than a regular hashing routine, because the values to be hashed are scattered
amongst the properties of a llvm::Function, not linear in memory. This scheme is
fast because only one word of state needs to be kept, and the mixing function is
a few instructions.

The main runOnModule function first computes the hash of each function, and only
further processes functions which do not have a unique function hash. The hash
is also used to order the sorted function set. If the hashes differ, their
values are used to order the functions, otherwise the full comparison is done.

Both of these are helpful in speeding up MergeFunctions. Together they result in
speedups of 9% for mysqld (a mostly C application with little redundancy), 46%
for libxul in Firefox, and 117% for Chromium. (These are all LTO builds.) In all
three cases, the new speed of MergeFunctions is about half that of the module
verifier, making it relatively inexpensive even for large LTO builds with
hundreds of thousands of functions. The same functions are merged, so this
change is free performance.

Author: jrkoenig

Reviewers: nlewycky, dschuff, jfb

Subscribers: llvm-commits, aemerson

Differential revision: http://reviews.llvm.org/D11923

llvm-svn: 245140
2015-08-15 01:18:18 +00:00
Hans Wennborg 99000c24c9 Delay emitting members of dllexport classes until the class is fully parsed (PR23542)
This enables Clang to correctly handle code such as:

  struct __declspec(dllexport) S {
    int x = 42;
  };

where it would otherwise error due to trying to generate the default
constructor before the in-class initializer for x has been parsed.

Differential Revision: http://reviews.llvm.org/D11850

llvm-svn: 245139
2015-08-15 01:18:16 +00:00
Alex Lorenz 3a4a60cba5 MIRLangRef: Describe the syntax that is used to represent machine basic blocks.
llvm-svn: 245138
2015-08-15 01:06:06 +00:00
Matt Arsenault 427a0fd22e LoopStrengthReduce: Try to pass address space to isLegalAddressingMode
This seems to only work some of the time. In some situations,
this seems to use a nonsensical type and isn't actually aware of the
memory being accessed. e.g. if branch condition is an icmp of a pointer,
it checks the addressing mode of i1.

llvm-svn: 245137
2015-08-15 00:53:06 +00:00
Richard Smith 3938f0c728 [modules] Stop dropping 'module.timestamp' files into the current directory
when building with implicit modules disabled.

llvm-svn: 245136
2015-08-15 00:34:15 +00:00
Matt Arsenault 297ae311ce AMDGPU/SI: Fix printing useless info with amdhsa
The comments at the bottom would all report 0 if
amdhsa was used.

llvm-svn: 245135
2015-08-15 00:12:39 +00:00
Matt Arsenault 0259a7aa41 AMDGPU/SI: Update LiveVariables
This is simple but won't work if/when this pass
is moved to be post-SSA.

llvm-svn: 245134
2015-08-15 00:12:37 +00:00
Matt Arsenault 670ba46efe AMDGPU/SI: Update LiveIntervals during SIFixSGPRLiveRanges
Does not mark SlotIndexes as reserved, although I think
that might be OK.

LiveVariables still need to be handled.

llvm-svn: 245133
2015-08-15 00:12:35 +00:00
Matt Arsenault b75233235c AMDGPU: Remove unnecessary assert
These shouldn't ever be null. The number of successors
was already asserted to be 2.

llvm-svn: 245132
2015-08-15 00:12:32 +00:00
Matt Arsenault 4275c29a02 AMDGPU/SI: Make comments more precise.
True branch instructions do behave as expected with liveness.

Avoid the phrasing "branch decision is based on a value in an SGPR"
because this could be misleading. A VALU compare instruction's
result is still based on an SGPR, even though that condition
may be divergent.

llvm-svn: 245131
2015-08-15 00:12:30 +00:00
Jason Molenda 650cc3dfd6 There is no such thing as gdb_arm_f8, this register set is f0-f7.
Remove this entry and adjust the numbering for the rest of the arm
register definitions.

llvm-svn: 245130
2015-08-15 00:09:23 +00:00
Oleksiy Vyalov c24da69ebf Fix Android build.
llvm-svn: 245129
2015-08-14 23:57:15 +00:00
Zachary Turner 398f9ed95c Enable settings test for i686 as well as i386.
llvm-svn: 245128
2015-08-14 23:29:32 +00:00
Zachary Turner 793d997585 Make skipUnlessArch decorator actually skip instead of XFAIL.
llvm-svn: 245127
2015-08-14 23:29:24 +00:00
Zachary Turner 6e19fe9954 XFAIL some data formatter tests on Windows.
Fixing these bugs is tracked by http://llvm.org/pr24462.

llvm-svn: 245126
2015-08-14 23:29:17 +00:00
Zachary Turner c714b07433 Disable libstdc++ and libcxx data formatter tests on Windows.
Neither of these libraries has been ported to Windows.  Eventually
if they are ever ported we can re-enable these tests.  But more
immediately what we need to do is add new data formatters for
MSVC's STL implementation.  This is tracked in
http://llvm.org/pr24460.

llvm-svn: 245125
2015-08-14 23:28:49 +00:00
Naomi Musgrave b9b46f5a58 clarified test comment
llvm-svn: 245124
2015-08-14 23:22:03 +00:00
Nathan Wilson b20ab9245a [CONCEPTS] Add diagnostic; invalid tag when concept specified
Summary: Adding check to emit diagnostic for invalid tag when concept is specified and associated tests.

Reviewers: rsmith, hubert.reinterpretcast, fraggamuffin, faisalv, aaron.ballman

Subscribers: aaron.ballman, cfe-commits

Differential Revision: http://reviews.llvm.org/D11916

llvm-svn: 245123
2015-08-14 23:19:32 +00:00
Greg Clayton 56de8a4b56 Unbreak the windows and linux buildbots.
llvm-svn: 245122
2015-08-14 23:16:12 +00:00
Greg Clayton 360dac7d58 Don't crash if we don't have a type system for a language.
llvm-svn: 245121
2015-08-14 23:15:48 +00:00
Sanjay Patel 7332e0455f make current codegen visible in the checks, so we can decide if it's right
llvm-svn: 245120
2015-08-14 23:03:01 +00:00
Nick Lewycky 8075fd22b9 Fix a crash where a utility function wasn't aware of fcmp vectors and created a value with the wrong type. Fixes PR24458!
llvm-svn: 245119
2015-08-14 22:46:49 +00:00
Bjarke Hammersholt Roune 9791ed4705 [SCEV] Apply NSW and NUW flags via poison value analysis for sub, mul and shl
Summary:
http://reviews.llvm.org/D11212 made Scalar Evolution able to propagate NSW and NUW flags from instructions to SCEVs for add instructions. This patch expands that to sub, mul and shl instructions.

This change makes LSR able to generate pointer induction variables for loops like these, where the index is 32 bit and the pointer is 64 bit:

  for (int i = 0; i < numIterations; ++i)
    sum += ptr[i - offset];

  for (int i = 0; i < numIterations; ++i)
    sum += ptr[i * stride];

  for (int i = 0; i < numIterations; ++i)
    sum += ptr[3 * (i << 7)];


Reviewers: atrick, sanjoy

Subscribers: sanjoy, majnemer, hfinkel, llvm-commits, meheff, jingyue, eliben

Differential Revision: http://reviews.llvm.org/D11860

llvm-svn: 245118
2015-08-14 22:45:26 +00:00
Pat Gavlin b399095c3f Add a target environment for CoreCLR.
Although targeting CoreCLR is similar to targeting MSVC, there are
certain important differences that the backend must be aware of
(e.g. differences in stack probes, EH, and library calls).

Differential Revision: http://reviews.llvm.org/D11012

llvm-svn: 245115
2015-08-14 22:41:43 +00:00
Sanjay Patel dd175bc6c4 make current codegen visible in the checks, so we can decide if it's right
llvm-svn: 245108
2015-08-14 22:10:59 +00:00
Ahmed Bougacha cd35787217 [AArch64] Fix FMLS scalar-indexed-from-2s-after-neg patterns.
We canonicalize V64 vectors to V128 through insert_subvector: the other
FMLA/FMLS/FMUL/FMULX patterns match that already, but this one doesn't,
so we'd fail to match fmls and generate fneg+fmla instead.
The vector equivalents are already tested and functional.

llvm-svn: 245107
2015-08-14 22:06:05 +00:00
Evgeniy Stepanov 24ac55d884 [msan] Fix handling of musttail calls.
MSan instrumentation for return values of musttail calls is not
allowed by the IR constraints, and not needed at the same time.

llvm-svn: 245106
2015-08-14 22:03:50 +00:00