This is a very minimal move support - it leaves the moved-from object in
a zombie state that is only valid for destruction and move assignment.
This seems fine to me, and leaving it in the default constructed state
would require adding more state to the object and potentially allocating
memory (!!!) and so seems like a Bad Idea.
llvm-svn: 245192
If we can ignore NaNs, fmin/fmax libcalls can become compare and select
(this is what we turn std::min / std::max into).
This IR should then be optimized in the backend to whatever is best for
any given target. Eg, x86 can use minss/maxss instructions.
This should solve PR24314:
https://llvm.org/bugs/show_bug.cgi?id=24314
Differential Revision: http://reviews.llvm.org/D11866
llvm-svn: 245187
This test case crashes the scalar code generation as we are not
consistent with the usage of the assumed context. To be precise, we
use the assumed context for the dependence analysis but not to
restrict the domains of the statements.
A step by step explanation of the problem is given in the test case.
llvm-svn: 245176
This option allows the user to provide additional information about parameter
values as an isl_set. To specify that N has the value 1024, we can provide
the context -polly-context='[N] -> {: N = 1024}'.
llvm-svn: 245175
Bitwise arithmetic can obscure a simple sign-test. If replacing the
mask with a truncate is preferable if the type is legal because it
permits us to rephrase the comparison more explicitly.
llvm-svn: 245171
analysis ...
It turns out that we *do* need the old CallGraph ported to the new pass
manager. There are times where this model of a call graph is really
superior to the one provided by the LazyCallGraph. For example,
GlobalsModRef very specifically needs the model provided by CallGraph.
While here, I've tried to make the move semantics actually work. =]
llvm-svn: 245170
We can set additional bits in a mask given that we know the other
operand of an AND already has some bits set to zero. This can be more
efficient if doing so allows us to use an instruction which implicitly
sign extends the immediate.
This fixes PR24085.
Differential Revision: http://reviews.llvm.org/D11289
llvm-svn: 245169
ByteSize and BitSize should not be size_t but unsigned, considering
1) They are at most 2^16 and 2^19, respectively.
2) BitSize is an argument to Type::getIntNTy which takes unsigned.
Also, use the correct utostr instead itostr and cache the string result.
Thanks to James Touton for reporting this!
llvm-svn: 245167
For cases where we TRUNCATE and then ZERO_EXTEND to a larger size (often from vector legalization), see if we can mask the source data and then ZERO_EXTEND (instead of after a ANY_EXTEND). This can help avoid having to generate a larger mask, and possibly applying it to several sub-vectors.
(zext (truncate x)) -> (zext (and(x, m))
Includes a minor patch to SystemZ to better recognise 8/16-bit zero extension patterns from RISBG bit-extraction code.
This is the first of a number of minor patches to help improve the conversion of byte masks to clear mask shuffles.
Differential Revision: http://reviews.llvm.org/D11764
llvm-svn: 245160
The July issue of TOPLAS contains a 50 page discussion of the AST generation
techniques used in Polly. This discussion gives not only an in-depth
description of how we (re)generate an imperative AST from our polyhedral based
mathematical program description, but also gives interesting insights about:
- Schedule trees: A tree-based mathematical program description that enables us
to perform loop transformations on an abstract level, while issues like the
generation of the correct loop structure and loop bounds will be taken care of
by our AST generator.
- Polyhedral unrolling: We discuss techniques that allow the unrolling of
non-trivial loops in the context of parameteric loop bounds, complex tile
shapes and conditionally executed statements. Such unrolling support enables
the generation of predicated code e.g. in the context of GPGPU computing.
- Isolation for full/partial tile separation: We discuss native support for
handling full/partial tile separation and -- in general -- native support for
isolation of boundary cases to enable smooth code generation for core
computations.
- AST generation with modulo constraints: We discuss how modulo mappings are
lowered to efficient C/LLVM code.
- User-defined constraint sets for run-time checks We discuss how arbitrary
sets of constraints can be used to automatically create run-time checks that
ensure a set of constrainst actually hold. This feature is very useful to
verify at run-time various assumptions that have been taken program
optimization.
Polyhedral AST generation is more than scanning polyhedra
Tobias Grosser, Sven Verdoolaege, Albert Cohen
ACM Transations on Programming Languages and Systems (TOPLAS), 37(4), July 2015
llvm-svn: 245157
infrastructure.
This AA was never used in tree. It's infrastructure also completely
overlaps that of TargetLibraryInfo which is used heavily by BasicAA to
achieve similar goals to those stated for this analysis.
As has come up in several discussions, the use case here is still really
important, but this code isn't helping move toward that use case. Any
progress on better supporting rich AA information for runtime library
environments would likely be better off starting from scratch or
starting from TargetLibraryInfo than from this base.
Differential Revision: http://reviews.llvm.org/D12028
llvm-svn: 245155
numbers in the key name "ehframe" or "eh_frame" in addition to the deprecated
"gcc" name (e.g. from a plugin.process.gdb-remote.target-definition-file
python file).
llvm-svn: 245151
When trying to fix SGPR live ranges, skip defs that are
killed in the same block as the def. I don't think
we need to worry about these cases as long as the
live ranges of the SGPRs in dominating blocks are
correct.
This reduces the number of elements the second
loop over the function needs to look at, and makes
it generally easier to understand. The second loop
also only considers if the live range is live
in to a block, which logically means it
must have been live out from another.
llvm-svn: 245150
Some personality routines require funclet exit points to be clearly
marked, this is done by producing a token at the funclet pad and
consuming it at the corresponding ret instruction. CleanupReturnInst
already had a spot for this operand but CatchReturnInst did not.
Other personality routines don't need to use this which is why it has
been made optional.
llvm-svn: 245149
function.
This was the same as getFrameIndexReference, but without the FrameReg
output.
Differential Revision: http://reviews.llvm.org/D12042
llvm-svn: 245148
This is just an initial checkin of an implementation of the Relooper algorithm, in preparation for WebAssembly codegen to utilize. It doesn't do anything yet by itself.
The Relooper algorithm takes an arbitrary control flow graph and generates structured control flow from that, utilizing a helper variable when necessary to handle irreducibility. The WebAssembly backend will be able to use this in order to generate an AST for its binary format.
Author: azakai
Reviewers: jfb, sunfish
Subscribers: jevinskie, arsenm, jroelofs, llvm-commits
Differential revision: http://reviews.llvm.org/D11691
llvm-svn: 245142
for eh_frame and stabs register numberings. This is not
complete but it's a step in the right direction. It's almost
entirely mechanical.
lldb informally uses "gcc register numbering" to mean eh_frame.
Why? Probably because there's a notorious bug with gcc on i386
darwin where the register numbers in eh_frame were incorrect.
In all other cases, eh_frame register numbering is identical to
dwarf.
lldb informally uses "gdb register numbering" to mean stabs.
There are no official definitions of stabs register numbers
for different architectures, so the implementations of gdb
and gcc are the de facto reference source.
There were some incorrect uses of these register number types
in lldb already. I fixed the ones that I saw as I made
this change.
This commit changes all references to "gcc" and "gdb" register
numbers in lldb to "eh_frame" and "stabs" to make it clear
what is actually being represented.
lldb cannot parse the stabs debug format, and given that no
one is using stabs any more, it is unlikely that it ever will.
A more comprehensive cleanup would remove the stabs register
numbers altogether - it's unnecessary cruft / complication to
all of our register structures.
In ProcessGDBRemote, when we get register definitions from
the gdb-remote stub, we expect to see "gcc:" (qRegisterInfo)
or "gcc_regnum" (qXfer:features:read: packet to get xml payload).
This patch changes ProcessGDBRemote to also accept "ehframe:"
and "ehframe_regnum" from these remotes.
I did not change GDBRemoteCommunicationServerLLGS or debugserver
to send these new packets. I don't know what kind of interoperability
constraints we might be working under. At some point in the future
we should transition to using the more descriptive names.
Throughout lldb we're still using enum names like "gcc_r0" and "gdb_r0",
for eh_frame and stabs register numberings. These should be cleaned
up eventually too.
The sources link cleanly on macosx native with xcode build. I
don't think we'll see problems on other platforms but please let
me know if I broke anyone.
llvm-svn: 245141
This patch makes the Merge Functions pass faster by calculating and comparing
a hash value which captures the essential structure of a function before
performing a full function comparison.
The hash is calculated by hashing the function signature, then walking the basic
blocks of the function in the same order as the main comparison function. The
opcode of each instruction is hashed in sequence, which means that different
functions according to the existing total order cannot have the same hash, as
the comparison requires the opcodes of the two functions to be the same order.
The hash function is a static member of the FunctionComparator class because it
is tightly coupled to the exact comparison function used. For example, functions
which are equivalent modulo a single variant callsite might be merged by a more
aggressive MergeFunctions, and the hash function would need to be insensitive to
these differences in order to exploit this.
The hashing function uses a utility class which accumulates the values into an
internal state using a standard bit-mixing function. Note that this is a different interface
than a regular hashing routine, because the values to be hashed are scattered
amongst the properties of a llvm::Function, not linear in memory. This scheme is
fast because only one word of state needs to be kept, and the mixing function is
a few instructions.
The main runOnModule function first computes the hash of each function, and only
further processes functions which do not have a unique function hash. The hash
is also used to order the sorted function set. If the hashes differ, their
values are used to order the functions, otherwise the full comparison is done.
Both of these are helpful in speeding up MergeFunctions. Together they result in
speedups of 9% for mysqld (a mostly C application with little redundancy), 46%
for libxul in Firefox, and 117% for Chromium. (These are all LTO builds.) In all
three cases, the new speed of MergeFunctions is about half that of the module
verifier, making it relatively inexpensive even for large LTO builds with
hundreds of thousands of functions. The same functions are merged, so this
change is free performance.
Author: jrkoenig
Reviewers: nlewycky, dschuff, jfb
Subscribers: llvm-commits, aemerson
Differential revision: http://reviews.llvm.org/D11923
llvm-svn: 245140
This enables Clang to correctly handle code such as:
struct __declspec(dllexport) S {
int x = 42;
};
where it would otherwise error due to trying to generate the default
constructor before the in-class initializer for x has been parsed.
Differential Revision: http://reviews.llvm.org/D11850
llvm-svn: 245139