Host-only cuda compilation does produce valid host object
file and in some cases users do want to proceed on to the linking phase.
The change removes special case that stopped compilation pipeline at
the Assembly phase. Device-side compilation is still stopped early
by the types::getCompilationPhases().
Differential Revision: http://reviews.llvm.org/D11573
llvm-svn: 243478
Store the locations for a macro expansion in a vector, then iterate over them
instead of using recursion. This simplifies the logic around the backtrace
limit and gives easier access to the source locations. No functionality change.
Patch by Zhengkai Wu.
Differential Revision: http://reviews.llvm.org/D11542
llvm-svn: 243477
Rename getBinaryBasename() to getProcessName() and, on Linux,
read it from /proc/self/cmdline instead of /proc/self/exe. The former
can be modified by the process. The main motivation is Android, where
application processes re-write cmdline to a package name. This lets
us setup per-application ASAN_OPTIONS through include=/some/path/%b.
llvm-svn: 243473
Summary:
Generate correct code for the select instruction by zero-extending
it's boolean/condition operand to GPR-width. This is necessary because
the conditional-move instructions operate on the whole register.
Reviewers: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11506
llvm-svn: 243469
The removal of in-process Linux debug support left a switch statement
with llvm::Triple::FreeBSD as the only case. Simplify by replacing it
with a now-equivalent assertion.
llvm-svn: 243468
callee-saved return address is stored in the caller's stack frame, not
the callee's. This patch adjusts the logic to find the LR in the
correct place for PowerPC.
Patch joint with Bill Seurer.
llvm-svn: 243467
error.
If the object being moved has a move constructor and a deleted copy constructor,
std::move is required, otherwise Clang will give a deleted constructor error.
llvm-svn: 243463
If the pointer is the store's value operand, this would produce
a broken module. Make sure the use is actually for the pointer operand.
llvm-svn: 243462
Summary:
Make Scalar Evolution able to propagate NSW and NUW flags from instructions to SCEVs in some cases. This is based on reasoning about when poison from instructions with these flags would trigger undefined behavior. This gives a 13% speed-up on some Eigen3-based Google-internal microbenchmarks for NVPTX.
There does not seem to be clear agreement about when poison should be considered to propagate through instructions. In this analysis, poison propagates only in cases where that should be uncontroversial.
This change makes LSR able to create induction variables for expressions like &ptr[i + offset] for loops like this:
for (int i = 0; i < limit; ++i) {
sum += ptr[i + offset];
}
Here ptr is a 64 bit pointer and offset is a 32 bit integer. For NVPTX, LSR currently creates an induction variable for i + offset instead, which is not as fast. Improving this situation is what brings the 13% speed-up on some Eigen3-based Google-internal microbenchmarks for NVPTX.
There are more details in this discussion on llvmdev.
June: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-June/thread.html#87234
July: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/thread.html#87392
Patch by Bjarke Roune
Reviewers: eliben, atrick, sanjoy
Subscribers: majnemer, hfinkel, jingyue, meheff, llvm-commits
Differential Revision: http://reviews.llvm.org/D11212
llvm-svn: 243460
Schedule trees are a lot easier to work with, for both humans and machines. For
humans the more structured schedule representation is easier to reason about.
Together with the more abstract isl programming interface this can result in a
lot cleaner code (see this changeset). For machines, the structured schedule and
the fact that we now use explicit piecewise affine expressions instead of
integer maps makes it easier to generate code from this schedule tree. As a
result, we can already see a slight compile-time improvement -- for 3mm from
0m0.593s to 0m0.551s seconds (-7 %). More importantly, future optimizations such
as full-partial tile separation will most likely result in more streamlined code
to be generated.
Contributed-by: Roman Gareev <gareevroman@gmail.com>
llvm-svn: 243458
GR64 <-> VR64 copies).
This commit adds a MIR test case for the commit r242191, which was committed
without one. This test case verifies that the ExpandPostRA pass expands the
GR64 <-> VR64 copies into the appropriate MMX_MOV instructions.
llvm-svn: 243457
Summary: MCAsmInfo is set up with the default AssemblerDialect, which is zero.
Subscribers: llvm-commits, sunfish, jfb
Differential Revision: http://reviews.llvm.org/D11567
llvm-svn: 243452
This commit extracts the code that parses a global value from the method
'parseGlobalAddressOperand' into a new method 'parseGlobalValue', so that this
code can be reused by the method which will parse the block address machine
operands.
llvm-svn: 243450
This commit moves the function 'lexName' to the start of the file so it can
be reused by the function which will lex the named LLVM IR block references.
llvm-svn: 243449
This commit removes an outdated TODO comment and a corresponding assertion
which asserts that the mir printer can't the print machine basic blocks that
aren't sequentially numbered.
This comment and assertion were correct when I was working on the patch which
serialized the machine basic blocks, but then I decided to add an 'ID'
attribute to the machine basic block's YAML mapping based on the patch review.
This comment and assertion then became invalid as with the 'ID' attribute we
can serialize the non sequential machine basic blocks and their references
without any problems.
llvm-svn: 243447
This applies default compiler flags to .S files, in particular removing
the "-pedantic" option, which is desirable because there is nothing to
reasonably warn about; and the only thing that gcc warns about is that
you allegedly can't correctly invoke GLUE2 in lib/builtins/assembly.h
on platforms for which USER_LABEL_PREFIX is the empty string.
In the gcc bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=33305 that
added the warning, a commenter notes that giving a macro of zero characters
to another macro is not precisely the same as failing to supply an argument,
and "there is a widespread belief in C++ community that such usage is valid".
Unfortunately the only way to silence the warning is to avoid -pedantic.
Differential Revision: http://reviews.llvm.org/D10713
llvm-svn: 243446
This commit removes the redundant parameters from the two methods
'initializeRegisterInfo' and 'initializeFrameInfo'. The removed parameters are
redundant as we are already passing in the 'MachineFunction' to those methods,
and those parameters can be derived from the machine function parameter.
llvm-svn: 243445
(Keep -Wmsvc-include around as an alias.)
While here, also replace the one other mention of "MSVC" in diagnostics with
"Microsoft", for consistency.
llvm-svn: 243444
This will be used for old targets like Android that do not
support ELF TLS models.
Differential Revision: http://reviews.llvm.org/D10524
llvm-svn: 243441
The 'common' section TLS is not implemented.
Current C/C++ TLS variables are not placed in common section.
DWARF debug info to get the address of TLS variables is not generated yet.
clang and driver changes in http://reviews.llvm.org/D10524
Added -femulated-tls flag to select the emulated TLS model,
which will be used for old targets like Android that do not
support ELF TLS models.
Added TargetLowering::LowerToTLSEmulatedModel as a target-independent
function to convert a SDNode of TLS variable address to a function call
to __emutls_get_address.
Added into lib/Target/*/*ISelLowering.cpp to call LowerToTLSEmulatedModel
for TLSModel::Emulated. Although all targets supporting ELF TLS models are
enhanced, emulated TLS model has been tested only for Android ELF targets.
Modified AsmPrinter.cpp to print the emutls_v.* and emutls_t.* variables for
emulated TLS variables.
Modified DwarfCompileUnit.cpp to skip some DIE for emulated TLS variabls.
TODO: Add proper DIE for emulated TLS variables.
Added new unit tests with emulated TLS.
Differential Revision: http://reviews.llvm.org/D10522
llvm-svn: 243438
Object: add IMAGE_FILE_MACHINE_ARM64
The official specifications state that the value of IMAGE_FILE_MACHINE_ARM64
is 0xAA64 (as per the Microsoft Portable Executable and Common Object Format
Specification v8.3).
Reviewers: rnk
Subscribers: llvm-commits, compnerd, ruiu
Differential Revision: http://reviews.llvm.org/D11511
llvm-svn: 243434
As of r240543 ProcessPOSIX and POSIXThread are used only on FreeBSD, so
just roll them into ProcessFreeBSD and FreeBSDThread.
Differential Revision: http://reviews.llvm.org/D10698
llvm-svn: 243427
- Use cached LLVM types
- Turn SmallVectors into Arrays/ArrayRef if the size is static
- Use ConstantInt::get's implicit splatting for vector types
No functionality change intended.
llvm-svn: 243425