We were previously codegen'ing memcpy as regular load/store operations and
hoping that the register allocator would allocate registers in ascending order
so that we could apply an LDM/STM combine after register allocation. According
to the commit that first introduced this code (r37179), we planned to teach the
register allocator to allocate the registers in ascending order. This never got
implemented, and up to now we've been stuck with very poor codegen.
A much simpler approach for achieving better codegen is to create MEMCPY pseudo
instructions, attach scratch virtual registers to them and then, post register
allocation, expand the MEMCPYs into LDM/STM pairs using the scratch registers.
The register allocator will have picked arbitrary registers which we sort when
expanding the MEMCPY. This approach also avoids the need to repeatedly calculate
offsets which ultimately ought to be eliminated pre-RA in order to decrease
register pressure.
Fixes PR9199 and PR23768.
[This is based on Peter Collingbourne's r238473 which was reverted.]
Differential Revision: http://reviews.llvm.org/D13239
Change-Id: I727543c2e94136e0f80b8e22d5642d7b9ee5b458
Author: Peter Collingbourne <peter@pcc.me.uk>
llvm-svn: 249322
For RealFileSystem this is getcwd()/chdir(), the synthetic file systems can
make up one for themselves. OverlayFileSystem now synchronizes the working
directories when a new FS is added to the overlay or the overlay working
directory is set. This allows purely artificial file systems that have zero
ties to the underlying disks.
Differential Revision: http://reviews.llvm.org/D13430
llvm-svn: 249316
This is a simple file system tree of memory buffers that can be filled by a
client. In conjunction with an OverlayFS it can be used to make virtual
files accessible right next to physical files. This can be used as a
replacement for the virtual file handling in FileManager and which I intend
to remove eventually.
llvm-svn: 249315
"msr pan, #imm", while only 1-bit immediate values should be valid.
Changed encoding and decoding for msr pstate instructions.
Differential Revision: http://reviews.llvm.org/D13011
llvm-svn: 249313
Summary:
An instruction like "(d)la $5, symbol+8" previously would have crashed the
assembler as it contains an expression. This is now fixed.
A few tests cases have also been changed to reflect these changes, however
these should only be syntax changes. Some new test cases have also been
added.
Patch by Scott Egerton.
Reviewers: vkalintiris, dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D12760
llvm-svn: 249311
Summary: Add the second template argument to the unique_ptr mock, and update the matcher so that it only matches against cases where the second argument is the default.
Reviewers: klimek
Subscribers: alexfh, cfe-commits
Differential Revision: http://reviews.llvm.org/D13433
llvm-svn: 249305
This extends the work done in r233995 so that now getFragment (in addition to
getSection) also works for variable symbols.
With that the existing logic to decide if a-b can be computed works even if
a or b are variables. Given that, the expression evaluation can avoid expanding
variables as aggressively and that in turn lets the relocation code see the
original variable.
In order for this to work with the asm streamer, there is now a dummy fragment
per section. It is used to assign a section to a symbol when no other fragment
exists.
This patch is a joint work by Maxim Ostapenko andy myself.
llvm-svn: 249303
OpenCL v1.1 s6.2.2: for the boolean value true, every bit in the result vector should be set.
This change treats the i1 value as signed for the purposes of performing the cast to integer,
and therefore sign extend into the result.
Patch by Neil Hickey!
http://reviews.llvm.org/D13349
llvm-svn: 249301
Summary: Now that we prioritize copying trivial types over using const-references where possible, I found some cases where, after the transformation, the loop was using the address of the local copy instead of the original object.
Reviewers: klimek
Subscribers: alexfh, cfe-commits
Differential Revision: http://reviews.llvm.org/D13431
llvm-svn: 249300
The entries are added if there are "_init" or "_fini" entries in
the symbol table respectively. According to the behavior of ld,
entries are inserted even for undefined symbols.
Symbol names can be overridden by using -init and -fini command
line switches. If used, these switches neither add new symbol table
entries nor require those symbols to be resolved.
Differential Revision: http://reviews.llvm.org/D13385
llvm-svn: 249297
Add symbol specified with -u as undefined which may cause additional
object files from archives to be linked into the resulting binary.
Differential Revision: http://reviews.llvm.org/D13345
llvm-svn: 249295
r249137 added support for the new mips-mti-linux toolchain. However,
the new tests of that commit, broke some buildbots because they didn't use
the correct regular expressions to capture the filename of Clang & LLD.
This commit re-applies the changes of r249137 and fixes the tests in
r249137 in order to match the filenames of the Clang and LLD executable.
llvm-svn: 249294
Summary:
https://llvm.org/bugs/show_bug.cgi?id=24960
modernize-use-nullptr would hit an assertion in some cases involving macros and initializer lists, due to finding a node with more than one parent (the two forms of the initializer list).
However, this doesn't mean that the replacement is incorrect, so instead of just rejecting this case I tried to find a way to make it work. Looking at the semantic form of the InitListExpr made sense to me (looking at both forms results in false negatives) but I am not sure of the things that we can miss by skipping the syntactic form.
Reviewers: klimek
Subscribers: cfe-commits, alexfh
Differential Revision: http://reviews.llvm.org/D13246
llvm-svn: 249291
statement. Specifically, we don't want people that have already written
a patch to just write a style guide for their 1-person project.
llvm-svn: 249290
Summary: When LLVM/Clang is built without ARM support, the ios_kext runtime
library is not built, but without this patch, the Makefile still tries to
copy it. This is a recent regression, because the ios_kext library used
to also be built on x86_64.
Reviewers: beanz
Subscribers: aemerson, cfe-commits, rengolin
Differential Revision: http://reviews.llvm.org/D13421
llvm-svn: 249281
In versions of clang prior to r238238, __declspec was recognized as a keyword in
all modes. It was then changed to only be enabled when Microsoft or Borland
extensions were enabled (and for CUDA, as a temporary measure). There is a
desire to support __declspec in Playstation code, and possibly other
environments. This commit adds a command-line switch to allow explicit
enabling/disabling of the recognition of __declspec as a keyword. Recognition
is enabled by default in Microsoft, Borland, CUDA, and PS4 environments, and
disabled in all other environments.
Patch by Warren Ristow!
llvm-svn: 249279
A statement with an empty domain complicates the invariant load
hoisting and does not help any subsequent analysis or transformation.
In fact it might introduce parameter dimensions or increase the
schedule dimensionality. To this end, we remove statements with an
empty domain early in the SCoP simplification.
llvm-svn: 249276
Before isValidCFG() could hide the fact that a loop is non-affine by
over-approximation. This is problematic if a subregion of the loop contains
an exit/latch block and is over-approximated. Now we do not over-approximate
in the isValidCFG function if we check loop control. If such control is
non-affine the whole loop is over-approximated, not only a subregion.
llvm-svn: 249273