This patch removes the std::string& argument from a number of C++ LTO API calls
and instead makes them use the installed diagnostic handler. This also
improves the consistency of the diagnostic handling infrastructure: if an LTO
client used
lto_codegen_set_diagnostic_handler() to install a custom error handler, we do
not want some error messages to go through the custom error handler, and some
other error messages to go into sLastErrorString.
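For illustration, a minimal sketch of a client-installed handler via the C
API (the handler body is illustrative, not part of this patch):

  #include "llvm-c/lto.h"
  #include <stdio.h>

  // With this patch, errors from the converted calls arrive here instead
  // of being stashed in sLastErrorString.
  static void handleDiag(lto_codegen_diagnostic_severity_t Severity,
                         const char *Diag, void *Ctxt) {
    fprintf(stderr, "lto: %s\n", Diag);
  }

  void installHandler(lto_code_gen_t CG) {
    lto_codegen_set_diagnostic_handler(CG, handleDiag, NULL);
  }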
llvm-svn: 253367
While setting function attributes, we check all instructions that may access memory. For a call instruction, we check all arguments. A special check is required for pointers.
I added vector-of-pointers to the call argument types that should be checked.
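A rough sketch of the extended type check (simplified; the name is
illustrative):

  // Treat both plain pointers and vectors of pointers as needing the
  // pointer check.
  static bool isPointerLikeArg(Type *Ty) {
    if (Ty->isPointerTy())
      return true;
    if (auto *VTy = dyn_cast<VectorType>(Ty))
      return VTy->getElementType()->isPointerTy();
    return false;
  }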
Differential Revision: http://reviews.llvm.org/D14693
llvm-svn: 253363
The underlying issues surrounding codegen for 32-bit vselects have been resolved. The pessimistic costs for 64-bit vselects remain due to the bad
scalarization that is still happening there.
I tested this on A57 in T32, A32 and A64 modes. I saw no regressions, and some improvements.
From my benchmarks, I saw these improvements in A57 (T32):
spec.cpu2000.ref.177_mesa 5.95%
lnt.SingleSource/Benchmarks/Shootout/strcat 12.93%
lnt.MultiSource/Benchmarks/MiBench/telecomm-CRC32/telecomm-CRC32 11.89%
I also measured A57 A32, A53 T32 and A9 T32 and found no performance regressions. I see much bigger wins in third-party benchmarks with this change.
Differential Revision: http://reviews.llvm.org/D14743
llvm-svn: 253349
Summary:
This patch changes the behavior of path::system_temp_directory() on Windows to be closer to the GetTempPath Windows API call. It enforces the native path separator, makes the path absolute, etc. GetTempPath is not used directly because of limitations/implementation bugs on Windows 7.
Windows-specific unit tests are added. Most of them run in a separate process with modified environment variables.
This change fixes the FileSystemTest.CreateDir unittest that had been failing when run from a Unix-like shell on Windows (a Unix-like path separator (/) was used in environment variables).
Reviewers: chapuni, rafael, aaron.ballman
Subscribers: rafael, llvm-commits
Differential Revision: http://reviews.llvm.org/D14231
llvm-svn: 253345
SELECT_CC has the nasty property of having operands with unrelated
types. So if you do something like:
f32 = select_cc f16, f16, f32, f32, cc
You'd only look for the action for <select_cc, f32>, but never f16.
If the types are all legal, but the op isn't (as for f16 on AArch64,
or for f128 on x86_64/AArch64?), then you get into trouble.
For f128, we have softenSetCCOperands to handle this case.
Similarly, for f16, we can directly promote the CC operands.
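A hedged sketch of the f16 case (operand order follows the select_cc form
above; simplified from the real lowering):

  // Promote the f16 compare operands to f32 so the usual
  // <select_cc, f32> action applies; the select operands are untouched.
  SDValue LHS = DAG.getNode(ISD::FP_EXTEND, DL, MVT::f32, N->getOperand(0));
  SDValue RHS = DAG.getNode(ISD::FP_EXTEND, DL, MVT::f32, N->getOperand(1));
  SDValue Sel = DAG.getNode(ISD::SELECT_CC, DL, N->getValueType(0), LHS, RHS,
                            N->getOperand(2), N->getOperand(3),
                            N->getOperand(4));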
llvm-svn: 253344
Now that setExecutable() has been changed to do all the groundwork to
make memory executable on the host, we can remove all (now-redundant)
calls to invalidate the instruction cache here.
As an added bonus, this makes invalidateInstructionCache() dead
code, so it can be removed.
Differential Revision: http://reviews.llvm.org/D13631
llvm-svn: 253343
setExecutable() should do everything that's needed to make the memory
executable on host, i.e. unconditionally set permissions + invalidate
instruction cache. llvm-rtdyld will be updated in my next commit.
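A rough sketch of the Unix flavor of this (simplified; not the exact
in-tree code):

  // Unconditionally make the block readable/executable, then flush the
  // instruction cache so the host actually sees the new code.
  if (::mprotect(M.base(), M.size(), PROT_READ | PROT_EXEC) == -1)
    return false;
  sys::Memory::InvalidateInstructionCache(M.base(), M.size());
  return true;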
Discussed with: Lang Hames (as part of D13631).
llvm-svn: 253341
Statepoint lowering currently expects that the target method of a
statepoint only defines a single value. This precludes using
statepoints with ABIs that return values in multiple registers
(e.g. the SysV AMD64 ABI). This change adds support for lowering
statepoints with multi-def targets.
llvm-svn: 253339
Several places in AsmPrinter.cpp print comments describing MachineOperand
registers using MCRegisterInfo, which uses MCOperand-oriented names. This
doesn't work for targets that use virtual registers exclusively, as
WebAssembly does, since virtual registers are represented and printed
differently.
This patch preserves what seems to be the spirit of r229978, avoiding the
use of TM.getSubtargetImpl(), while still using MachineOperand-oriented
printing for MachineOperands.
Differential Revision: http://reviews.llvm.org/D14709
llvm-svn: 253338
Currently, if the assembler encounters an error after parsing (such as an
out-of-range fixup), it reports this as a fatal error, and so stops after the
first error. However, for most of these there is an obvious way to recover
after emitting the error, such as emitting the fixup with a value of zero. This
means that we can report on all of the errors in a file, not just the first
one. MCContext::reportError records the fact that an error was encountered, so
we won't actually emit an object file with the incorrect contents.
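A hedged sketch of the recovery pattern in a fixup check (the range check
itself is illustrative):

  if (!isIntN(16, Value)) {
    Ctx.reportError(Fixup.getLoc(), "fixup value out of range");
    // Recover with a value of zero so encoding can continue; no object
    // file is emitted because the context has recorded the error.
    Value = 0;
  }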
Differential Revision: http://reviews.llvm.org/D14717
llvm-svn: 253328
This adds reportError to MCContext, which can be used as an alternative to
reportFatalError when the assembler wants to try to continue processing the
rest of the file after the error is reported, so that all of the errors in a
file can be reported. It records the fact that an error was encountered, so we
can avoid emitting an object file if any errors occurred.
This patch doesn't add any uses of this function (a later patch will convert
most uses of reportFatalError to use it), but there is a small functional
change: we use the SourceManager to print the error message, even if we have a
null SMLoc. This means that we get a SourceManager-style message, with the file
and line information shown as <unknown>, rather than the "LLVM ERROR" style
used by report_fatal_error.
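A simplified sketch of the behavior described above (member names assumed):

  void MCContext::reportError(SMLoc Loc, const Twine &Msg) {
    HadError = true;
    // Use the SourceManager even for a null SMLoc, so the message has the
    // SourceManager style (<unknown> file/line) rather than "LLVM ERROR".
    if (SrcMgr)
      SrcMgr->PrintMessage(Loc, SourceMgr::DK_Error, Msg);
    else
      report_fatal_error(Msg);
  }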
llvm-svn: 253327
Indexed profile data as designed today does not guarantee
counter data to be well aligned, so reading needs to use
the slower form (with memcpy). This is less than ideal and
should be improved in the future (e.g., with a fixed-length
function key instead of a variable-length name key).
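The slower form referred to here is the usual memcpy idiom for unaligned
reads:

  #include <cstdint>
  #include <cstring>

  // Safe for any alignment; a direct uint64_t load would not be.
  static uint64_t readUnalignedU64(const char *Ptr) {
    uint64_t V;
    std::memcpy(&V, Ptr, sizeof(V));
    return V;
  }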
llvm-svn: 253309
The way prelink used to work was:
* The compiler decides if a given section only has relocations that
are known to point to the same DSO. If so, it names it
.data.rel.ro.local<something>.
* The static linker puts all of these together.
* The prelinker program assigns addresses to each library and resolves
the local relocations.
There are many problems with this:
* It is incompatible with address space randomization.
* The information passed by the compiler is redundant. The linker
knows if a given relocation is in the same DSO or not. It could sort
by that if so desired.
* There are newer ways of speeding up DSOs (GNU hash, for example).
* Even if we want to implement this again in the compiler, the previous
implementation is pretty broken. It talks about relocations that are
"resolved by the static linker". If they are resolved, there are none
left for the prelinker. What one needs to track is whether an expression
will require only dynamic relocations that point to the same DSO.
At this point it looks like the prelinker is a historical curiosity.
For example, Fedora has retired it because it failed to build for two
releases
(http://pkgs.fedoraproject.org/cgit/prelink.git/commit/?id=eb43100a8331d91c801ee3dcdb0a0bb9babfdc1f)
This patch removes support for it. That is, it stops printing the
".local" sections.
llvm-svn: 253280
Allowing imprecise lane masks in the case of more than 32 sub-register
lanes led to some tricky corner cases, and I would need another bugfix
for another one. Instead, I would rather declare lane masks as precise
and let tablegen abort if we do not have enough bits.
This does not affect any in-tree target; even AMDGPU only needs 16
lanes at the moment. If 32 lanes turn out to be a problem in the
future, we can easily change the LaneBitmask typedef to uint64_t.
Differential Revision: http://reviews.llvm.org/D14557
llvm-svn: 253279
This regressed in r252656, which wasn't quite NFC. Instead of using a
custom instruction as before, use a pattern to select CONST_I32 for the
global addrs.
Differential Revision: http://reviews.llvm.org/D14587
llvm-svn: 253276
Useful utility function; this wasn't too hard to do before, but also wasn't
obviously discoverable. Make it explicit. Reviewed offline by Michael
Gottesman.
llvm-svn: 253254
In r253126 we stopped recomputing LCSSA after loop unrolling in all
cases, except when the unrolling is full and at least one of the loop
exits is outside the parent loop. In other cases the transformation
should not break LCSSA, but it turned out that we also call
SimplifyLoop on the parent loop, which might break LCSSA by itself.
This fix just triggers LCSSA recomputation in this case as well.
I'm committing it without a test case for now, but I'll try to invent
one. It's a bit tricky because in an isolated test LoopSimplify would
be scheduled before LoopUnroll, and thus would change the test and hide
the problem.
llvm-svn: 253253
Summary:
Previously, return type information for a function was derived from its
return DAG nodes. But this didn't work for DAGs that do not have exactly
one return node. So instead, compute it directly from the LLVM function,
as is done for imports.
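A minimal sketch of the idea (the type-mapping helper is hypothetical):

  // Derive the result type from the IR signature rather than from
  // whatever return DAG nodes happen to exist.
  const Function &F = *MF.getFunction();
  Type *RetTy = F.getReturnType();
  if (!RetTy->isVoidTy())
    Results.push_back(toWasmType(RetTy)); // hypothetical Type -> MVT mapping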
Differential Revision: http://reviews.llvm.org/D14593
llvm-svn: 253251
Summary: This is to match the new version in the spec
Reviewers: sunfish
Subscribers: jfb, llvm-commits, dschuff
Differential Revision: http://reviews.llvm.org/D14519
llvm-svn: 253249
On top of that, don't bother allocating and initializing UnwindHelp if
we don't have any funclets. Currently we always use RBP as our frame
pointer when funclets are present, so this change makes it impossible to
come here without any fixed stack objects.
Fixes PR25533.
llvm-svn: 253245
We sometimes create intermediate subtract instructions during
reassociation. Adding these to the worklist to revisit exposes many
additional reassociation opportunities.
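A hedged sketch of the pattern (RedoInsts is Reassociate's existing
worklist):

  // After materializing an intermediate subtract, queue it so
  // reassociation revisits it.
  Instruction *Sub = BinaryOperator::CreateSub(A, B, "sub", InsertPt);
  RedoInsts.insert(Sub);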
Patch by Aditya Nandakumar.
llvm-svn: 253240
We tried to move the insertion point beyond instructions like landingpad
and cleanuppad.
However, we *also* tried to move past catchpad. This is problematic
because catchpad is also a terminator.
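A hedged sketch of the corrected skip loop (simplified):

  // landingpad and cleanuppad are ordinary leading instructions, but
  // catchpad terminates its block, so it is deliberately absent here.
  BasicBlock::iterator IP = BB->begin();
  while (isa<LandingPadInst>(&*IP) || isa<CleanupPadInst>(&*IP))
    ++IP;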
This fixes PR25541.
llvm-svn: 253238
Function ARMConstantIslands::doInitialJumpTablePlacement() iterates over all
basic blocks in a machine function. It calls `MI = MBB.getLastNonDebugInstr()`
to get the last instruction in each block and then uses MI->getOpcode() to
decide what to do. If getLastNonDebugInstr() returns MBB.end() (for example,
when the block does not contain any instructions) then calling getOpcode() on
this value is incorrect. Avoid this problem by checking the result of
getLastNonDebugInstr().
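The guarded pattern, close to the fix described above:

  MachineBasicBlock::iterator MI = MBB.getLastNonDebugInstr();
  if (MI == MBB.end())
    continue; // empty (or debug-only) block: nothing to classify
  unsigned Opcode = MI->getOpcode(); // now safe to call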
Differential Revision: http://reviews.llvm.org/D14694
llvm-svn: 253222
Storing the source location of the expression that created a constant pool
entry allows us to emit better error messages if we later discover that the
expression cannot be represented by a relocation.
Differential Revision: http://reviews.llvm.org/D14646
llvm-svn: 253220
The MCValue class can store a SMLoc to allow better error messages to be
emitted if an error is detected after parsing. The ARM and AArch64 assembly
parsers were not setting this, so error messages did not have source
information.
Differential Revision: http://reviews.llvm.org/D14645
llvm-svn: 253219
Summary:
* ARMv6KZ is the "canonical" name, given in the ARMARM
* ARMv6Z is an "official abbreviation" for it, mentioned in the ARMARM
* ARMv6ZK is a popular misspelling, which we should support as an alias.
The patch corrects the handling of the names.
Functional changes:
* ARMv6Z no longer treated as an architecture in its own right
* ARMv6ZK renamed to ARMv6KZ, accepting ARMv6ZK as an alias
* arm1176jz-s and arm1176jzf-s recognized as ARMv6KZ, instead of ARMv6K
* default ARMv6K CPU changed to arm1176j-s
Reviewers: rengolin, logan, compnerd
Subscribers: aemerson, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D14568
llvm-svn: 253206
Summary:
* declare FPUNames, ARCHNames, ARCHExtNames, HWDivNames, CPUNames
as static const
* implement getDefaultExtensions with a StringSwitch, in the same
way getDefaultFPU is implemented (see the sketch below)
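A hedged sketch of the StringSwitch form (the CPU names and extension
values here are illustrative, not the full table):

  unsigned ARM::getDefaultExtensions(StringRef CPU) {
    return StringSwitch<unsigned>(CPU)
        .Case("arm1176jzf-s", ARM::AEK_SEC)
        .Case("cortex-a15", ARM::AEK_HWDIV)
        .Default(ARM::AEK_INVALID);
  }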
Reviewers: rengolin
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14648
llvm-svn: 253201
Instead of defaulting to an empty string, we want to default to the
CPU 'generic' in the case of no valid default CPU being found (as long
as the architecture is actually valid).
In order to do this, we add a default FPU for each architecture, and
we fall back to the architecture defaults for extensions and FPU when
a generic CPU is specified.
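A hedged sketch of the fallback (the helper names are illustrative):

  StringRef CPU = getDefaultCPU(ArchName); // may be empty
  if (CPU.empty() && isValidArch(ArchName))
    CPU = "generic"; // then use the architecture's default FPU/extensions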
llvm-svn: 253198
This allows for accurate architecture targeting as well as removing
duplicate information (hardcoded feature strings) from MCTargetDesc.
llvm-svn: 253196
This was left implicit and never actually checked, which meant we could have a CMPZ against some non-zero value and carry on with the BFI conversion regardless.
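A hedged sketch of the now-explicit guard (names simplified):

  // Only continue with the BFI conversion when the CMPZ really is
  // against zero.
  auto *C = dyn_cast<ConstantSDNode>(CmpZ.getOperand(1));
  if (!C || C->getZExtValue() != 0)
    return SDValue();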
Caught by Oliver Stannard using csmith; regression test added.
llvm-svn: 253195
Summary:
This fails a check in Verifier.cpp, which checks for location matches between the declared
variable and the !dbg attachments.
Reviewers: dnovillo, dblaikie, danielcdh
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14657
llvm-svn: 253194
The AArch64 assembler was silently ignoring instructions like this:
ldr foo, =bar
AArch64AsmParser::parseOperand was returning true, as the parse had failed, but
was not calling AArch64AsmParser::Error to report this to the user, so the
instruction was ignored without an error message being printed.
Differential Revision: http://reviews.llvm.org/D14651
llvm-svn: 253193
Address Duncan Exon Smith's comments on D14148, which were added after the patch had been LGTM'd and committed:
* clang-format one area where whitespace diffs occurred.
* Add a threshold to limit the store/load dominance checks as they are quadratic.
llvm-svn: 253192
Summary: Since we're passing references to dbg.value as pointers,
we need to have the frontend properly declare their sizes and
alignments (as it already does for regular pointers) in preparation
for my upcoming patch to have the verifier check that the sizes agree.
Also augment the backend logic that skips actually emitting this
information into DWARF such that it also handles reference types.
Reviewers: aprantl, dexonsmith, dblaikie
Subscribers: dblaikie, llvm-commits
Differential Revision: http://reviews.llvm.org/D14275
llvm-svn: 253186
Summary: The Old personality function gets copied over, but the
Materializer didn't have a chance to inspect it (e.g. to fix up
references to the correct module for the target function).
Also add a verifier check that makes sure the personality routine
is in the same module as the function whose personality it is.
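A sketch of the new verifier rule, using Verifier.cpp's Assert macro (the
message text is illustrative):

  if (F.hasPersonalityFn())
    if (auto *Per =
            dyn_cast<Function>(F.getPersonalityFn()->stripPointerCasts()))
      Assert(Per->getParent() == F.getParent(),
             "Referencing personality function in another module!", &F, Per);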
Reviewers: majnemer
Subscribers: jevinskie, llvm-commits
Differential Revision: http://reviews.llvm.org/D14474
llvm-svn: 253183
Summary: Moving landingpads into successor basic blocks makes the
verifier sad. Teach Sink that much like PHI nodes and terminator
instructions, landingpads (and cleanuppads, etc.) may not be moved
between basic blocks.
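A hedged sketch of the guard added to the sinking legality check
(simplified; the function name is illustrative):

  // Like PHIs and terminators, EH pads are pinned to their block.
  static bool canMoveBetweenBlocks(Instruction *I) {
    return !isa<PHINode>(I) && !I->isTerminator() && !I->isEHPad();
  }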
Reviewers: majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14475
llvm-svn: 253182
Summary:
The patch to move metadata linking after global value linking didn't
correctly map unmaterialized global values to null as desired. They
were in fact mapped to the source copy. It largely worked by accident,
since most module linker clients destroyed the source module, which
caused the source GVs to be replaced by null, but it caused a failure
with LTO linking on Windows:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312869.html
The problem is that a null return value from materializeValueFor is
handled by mapping the value to self. This is the desired behavior when
materializeValueFor is passed a non-GlobalValue. The problem is how to
distinguish that case from the case where we really do want to map to
null.
This patch addresses this by passing in a new flag to the value mapper
indicating that unmapped global values should be mapped to null. Other
Value types are handled as before.
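A hedged sketch of the flag's effect inside the value mapper (simplified):

  if (Materializer) {
    if (Value *NewV =
            Materializer->materializeValueFor(const_cast<Value *>(V)))
      return VM[V] = NewV;
    // New behavior: an unmaterialized GlobalValue maps to null when the
    // flag is set; other Values keep mapping to themselves.
    if (isa<GlobalValue>(V) && (Flags & RF_NullMapMissingGlobalValues))
      return VM[V] = nullptr;
  }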
Note that the documented behavior of asserting on unmapped values when
the flag RF_IgnoreMissingValues isn't set is currently disabled with
FIXME notes due to bootstrap failures. I modified these disabled asserts
so that, when they are eventually enabled again, they won't assert for the
unmapped values when the new RF_NullMapMissingGlobalValues flag is set.
I also considered using a callback into the value materializer, but a
flag seemed cleaner given that there are already existing flags.
I also considered modifying materializeValueFor to return the input
value when we want to map to source and then treat a null return
to mean map to null. However, there are other value materializer
subclasses that implement materializeValueFor, and they would all need
to be audited and the return values possibly changed, which seemed
error-prone.
Reviewers: dexonsmith, joker.eph
Subscribers: pcc, llvm-commits
Differential Revision: http://reviews.llvm.org/D14682
llvm-svn: 253170
Global to local demotion can speed up programs that use globals a lot. It is particularly useful with LTO, when the entire call graph is known and most functions have been internalized.
For a global to be demoted, it must only be accessed by one function and that function:
1. Must never recurse directly or indirectly, else the GV would be clobbered.
2. Must never rely on the value in GV at the start of the function (apart from the initializer).
GlobalOpt can already do this, but it is hamstrung and only ever tries to demote globals inside "main", because C++ gives extra guarantees about how main is called - once and only once.
In LTO mode, we can often prove the first property (if the function is internal by this point, we know enough about the callgraph to determine if it could possibly recurse). FunctionAttrs now infers the "norecurse" attribute for this reason.
The second property can be proven for a subset of functions by proving that all loads from GV are dominated by a store to GV. This is conservative in the name of compile time: it only requires a DominatorTree, which is fairly cheap in the grand scheme of things. We could do fancier things with MemoryDependenceAnalysis to catch more cases, but this appears to catch most of the useful ones in my testing.
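The check is roughly of this shape (a sketch; the name is illustrative and
the real code is more careful about other kinds of users):

  // Every load must be dominated by some store, so the initial value of
  // GV is never observed by this function.
  static bool loadsDominatedByStores(GlobalVariable &GV, DominatorTree &DT) {
    SmallVector<StoreInst *, 4> Stores;
    SmallVector<LoadInst *, 4> Loads;
    for (User *U : GV.users()) {
      if (auto *SI = dyn_cast<StoreInst>(U))
        Stores.push_back(SI);
      else if (auto *LI = dyn_cast<LoadInst>(U))
        Loads.push_back(LI);
      else
        return false; // be conservative about any other user
    }
    for (LoadInst *LI : Loads) {
      bool Dominated = false;
      for (StoreInst *SI : Stores) // quadratic, hence the threshold above
        if (DT.dominates(SI, LI)) {
          Dominated = true;
          break;
        }
      if (!Dominated)
        return false;
    }
    return true;
  }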
llvm-svn: 253168
The current implementation of the GEP visitor in InstCombine fails with an assertion on a vector GEP with a mix of scalar and vector types, like this:
getelementptr double, double* %a, <8 x i32> %i
(It fails to create a "sext" from <8 x i32> to <8 x i64>)
I fixed it and added some tests.
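A hedged sketch of the fix (simplified from the GEP visitor; API details
are era-specific):

  // When the GEP is vector but the pointer operand is scalar, the index
  // type to extend to must itself be widened to a vector type first.
  Type *IntPtrTy = DL.getIntPtrType(GEP.getPointerOperandType());
  if (GEP.getType()->isVectorTy() && !IntPtrTy->isVectorTy())
    IntPtrTy =
        VectorType::get(IntPtrTy, GEP.getType()->getVectorNumElements());
  if (Idx->getType() != IntPtrTy)
    Idx = Builder->CreateSExtOrTrunc(Idx, IntPtrTy); // <8 x i32> -> <8 x i64>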
Differential Revision: http://reviews.llvm.org/D14485
llvm-svn: 253162
Summary:
Document the differences between the isKnownWindowsMSVC() and isWindowsMSVC() methods on Triple.
Also removed the '\brief' Doxygen annotations - now that 'AUTOBRIEF' is set to on, they are unnecessary.
Subscribers: jfb, tberghammer, danalbert, srhines, dschuff
Differential Revision: http://reviews.llvm.org/D14110
llvm-svn: 253159
Summary:
There are currently two blocks with the METADATA_BLOCK id at module
scope. The first has the module-level metadata values (consisting of
some combination of METADATA_* record codes except for METADATA_KIND).
The second consists only of METADATA_KIND records. The latter is used
only in the METADATA_ATTACHMENT block within function blocks (for
metadata attached to instructions).
For ThinLTO we want to delay the parsing of module level metadata
until all functions have been imported from that module (there is some
bookkeeping used to suture it up when we read it during a post-pass).
However, we do need the METADATA_KIND records when parsing the function
body during importing, since those kinds are used as described above.
To simplify identification and parsing of just the block containing
the metadata kinds, use a different block id (METADATA_KIND_BLOCK_ID).
Support older bitcode without the new block id as well.
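A hedged sketch of the reader side (simplified; error handling follows the
era's std::error_code style):

  switch (Entry.ID) {
  case bitc::METADATA_BLOCK_ID:
    // For ThinLTO importing, parsing this can be deferred until all
    // functions have been imported from the module.
    break;
  case bitc::METADATA_KIND_BLOCK_ID:
    // Parsed eagerly: the kinds are needed while parsing function
    // bodies (METADATA_ATTACHMENT records refer to them).
    if (std::error_code EC = parseMetadataKinds())
      return EC;
    break;
  }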
Reviewers: dexonsmith, joker.eph
Subscribers: davidxl, llvm-commits
Differential Revision: http://reviews.llvm.org/D14654
llvm-svn: 253154
MCRelaxableFragment previously kept a copy of MCSubtargetInfo and
MCInst to enable re-encoding the MCInst later during relaxation. A copy
of MCSubtargetInfo (instead of a reference or pointer) was needed
because the feature bits could be modified by the parser.
This commit replaces the MCSubtargetInfo copy in MCRelaxableFragment
with a constant reference to MCSubtargetInfo. The copies of
MCSubtargetInfo are kept in MCContext, and the target parsers are now
responsible for asking MCContext to provide a copy whenever the feature
bits of MCSubtargetInfo have to be toggled.
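A hedged sketch of the new flow in a target parser (the helper names here
are assumptions, not necessarily the exact API):

  // Ask MCContext for a mutable copy before toggling feature bits; the
  // fragments then keep a constant reference into MCContext-owned data.
  MCSubtargetInfo &STICopy = getContext().getSubtargetCopy(getSTI());
  STICopy.ToggleFeature("thumb-mode");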
With this patch, I saw a 4% reduction in peak memory usage when I
compiled verify-uselistorder.lto.bc using llc.
rdar://problem/21736951
Differential Revision: http://reviews.llvm.org/D14346
llvm-svn: 253127