If an inline function is observed but unused in a translation unit, dummy
coverage mapping data with zero hash is stored for this function.
If such a coverage mapping section came earlier than real one, the latter
was ignored. As a result, llvm-cov was unable to show coverage information
for those functions.
Differential Revision: http://reviews.llvm.org/D20286
llvm-svn: 270194
The ``nprocs`` command does not exist under Mac OSX so use
``sysctl`` instead on that platform.
Whilst I'm here
* Use ``pclose()`` instead of ``fclose()`` which the ``popen()``
documentation says should be used.
* Check for errors that were previously unhandled.
Differential Revision: http://reviews.llvm.org/D20409
llvm-svn: 270172
an instruction.
Use the previously introduced RepairingPlacement class to split the code
computing the repairing placement from the code doing the actual
placement. That way, we will be able to consider different placement and
then, only apply the best one.
llvm-svn: 270168
When assigning the register banks we may have to insert repairing code
to move already assigned values accross register banks.
Introduce a few helper classes to keep track of what is involved in the
repairing of an operand:
- InsertPoint and its derived classes record the positions, in the CFG,
where repairing has to be inserted.
- RepairingPlacement holds all the insert points for the repairing of an
operand plus the kind of action that is required to do the repairing.
This is going to be used to keep track of how the repairing should be
done, while comparing different solutions for an instruction. Indeed, we
will need the repairing placement to capture the cost of a solution and
we do not want to compute it a second time when we do the actual
repairing.
llvm-svn: 270167
register bank twice.
Prior to this change, we were checking if the assignment for the current
machine operand was matching, then we would check if the mismatch
requires to insert repair code.
We actually already have this information from the first check, so just
pass it along.
NFCI.
llvm-svn: 270166
Sorry for the lack testcase. There is one in the pr, but it depends on
std::sort and the .ll version is 110 lines, so I don't think it is
wort it.
The bug was that we were sorting after adding a terminator, and the
sorting algorithm could end up putting the terminator in the middle of
the List vector.
With that we would create a Spans map entry keyed on nullptr which would
then be added to CUs and fail in that sorting.
llvm-svn: 270165
This helper class will be used to represent the cost of mapping an
instruction to a specific register bank.
The particularity of these costs is that they are mostly local, thus the
frequency of the basic block is irrelevant. However, for few
instructions (e.g., phis and terminators), the cost may be non-local and
then, we need to account for the frequency of the involved basic blocks.
This will be used by the greedy mode I am working on.
llvm-svn: 270163
Before r257832, the threshold used by SimpleInliner was explicitly specified or generated from opt levels and passed to the base class Inliner's constructor. There, it was first overridden by explicitly specified -inline-threshold. The refactoring in r257832 did not preserve this behavior for all opt levels. This change brings back the original behavior.
Differential Revision: http://reviews.llvm.org/D20452
llvm-svn: 270153
Sequences of range checks expressed using guards, like
guard((I - 2) u< L)
guard((I - 1) u< L)
guard((I + 0) u< L)
guard((I + 1) u< L)
guard((I + 2) u< L)
can sometimes be combined into a smaller sequence:
guard((I - 2) u< L AND (I + 2) u< L)
if we can prove that (I - 2) u< L AND (I + 2) u< L implies all of checks
expressed in the previous sequence.
This change teaches GuardWidening to do this kind of merging when
feasible.
llvm-svn: 270151
Using Chandler's words from r265331:
This commit was greatly exacerbating PR17409 and effectively regressed
build time for lot of (very large) code when compiled with ASan or MSan.
PR17409 is fixed by r269249, so this is fine to reapply r263460.
Original commit message:
The bad behavior happens when we have a function with a long linear
chain of basic blocks, and have a live range spanning most of this
chain, but with very few uses.
Let say we have only 2 uses.
The Hopfield network is only seeded with two active blocks where the
uses are, and each iteration of the outer loop in
`RAGreedy::growRegion()` only adds two new nodes to the network due to
the completely linear shape of the CFG. Meanwhile,
`SpillPlacer->iterate()` visits the whole set of discovered nodes, which
adds up to a quadratic algorithm.
This is an historical accident effect from r129188.
When the Hopfield network is expanding, most of the action is happening
on the frontier where new nodes are being added. The internal nodes in
the network are not likely to be flip-flopping much, or they will at
least settle down very quickly. This means that while
`SpillPlacer->iterate()` is recomputing all the nodes in the network, it
is probably only the two frontier nodes that are changing their output.
Instead of recomputing the whole network on each iteration, we can
maintain a SparseSet of nodes that need to be updated:
- `SpillPlacement::activate()` adds the node to the todo list.
- When a node changes value (i.e., `update()` returns true), its
neighbors are added to the todo list.
- `SpillPlacement::iterate()` only updates the nodes in the list.
The result of Hopfield iterations is not necessarily exact. It should
converge to a local minimum, but there is no guarantee that it will find
a global minimum. It is possible that updating nodes in a different
order will cause us to switch to a different local minimum. In other
words, this is not NFC, but although I saw a few runtime improvements
and regressions when I benchmarked this change, those were side effects
and actually the performance change is in the noise as expected.
Huge thanks to Jakob Stoklund Olesen <stoklund@2pi.dk> for his
feedbacks, guidance and time for the review.
llvm-svn: 270149
Work around crashes in ``__sanitizer_malloc_hook()`` under Mac OSX.
Under Mac OSX we intercept calls to malloc before thread local
storage is initialised leading to a crash when accessing
``AllocTracer``. To workaround this ``AllocTracer`` is only accessed
in the hook under Linux. For symmetry ``__sanitizer_free_hook()``
is also modified in the same way.
To support this change a set of new macros
LIBFUZZER_LINUX and LIBFUZZER_APPLE has been defined which can be
used to check the target being compiled for.
Differential Revision: http://reviews.llvm.org/D20402
llvm-svn: 270145
This removes the subclasses of ProfileSummary, moves the members of the derived classes to the base class.
Differential Revision: http://reviews.llvm.org/D20390
llvm-svn: 270143
When matching an interleaved load to an ldN pattern, the interleaved access
pass checks that all users of the load are shuffles. If the load is used by an
instruction other than a shuffle, the pass gives up and an ldN is not
generated. This patch considers users of the load that are extractelement
instructions. It attempts to modify the extracts to use one of the available
shuffles rather than the load. After the transformation, the load is only used
by shuffles and will then be matched with an ldN pattern.
Differential Revision: http://reviews.llvm.org/D20250
llvm-svn: 270142
This splits ProfileSummary into two classes: a ProfileSummary class that has methods to convert from/to metadata and a ProfileSummaryBuilder class that computes the profiles summary which is in ProfileData.
Differential Revision: http://reviews.llvm.org/D20314
llvm-svn: 270136
This patch fixes https://llvm.org/bugs/show_bug.cgi?id=27703.
If there is a sequence of one or more load instructions, each loaded value is used as address of later load instruction, bitcast is necessary to change the value type, don't optimize it.
llvm-svn: 270135
This re-applies r270115.
Many of the MachO load commands can have data appended after the command structure. This data is frequently strings, but can actually be anything. This patch adds support for three optional fields on load command yaml descriptions.
The new PayloadString YAML field is populated with the data after load commands known to have strings as extra data.
The new ZeroPadBytes YAML field is a count of zero'd bytes after the end of the load command structure before the next command. This can apply anywhere in the file. MachO2YAML verifies that bytes are zero before populating this field, and YAML2MachO will add zero'd bytes.
The new PayloadBytes YAML field stores all bytes after the end of the load command structure before the next command if they are non-zero. This is a catch all for all unhandled bytes. If MachO2Yaml populates PayloadBytes it will not populate ZeroPadBytes, instead zero'd bytes will be in the PayloadBytes structure.
llvm-svn: 270124
Many of the MachO load commands can have data appended after the command structure. This data is frequently strings, but can actually be anything. This patch adds support for three optional fields on load command yaml descriptions.
The new PayloadString YAML field is populated with the data after load commands known to have strings as extra data.
The new ZeroPadBytes YAML field is a count of zero'd bytes after the end of the load command structure before the next command. This can apply anywhere in the file. MachO2YAML verifies that bytes are zero before populating this field, and YAML2MachO will add zero'd bytes.
The new PayloadBytes YAML field stores all bytes after the end of the load command structure before the next command if they are non-zero. This is a catch all for all unhandled bytes. If MachO2Yaml populates PayloadBytes it will not populate ZeroPadBytes, instead zero'd bytes will be in the PayloadBytes structure.
llvm-svn: 270115
Since the calls don't return, the instruction afterwards will never run,
and is just taking up unnecessary space in the binary.
Differential Revision: http://reviews.llvm.org/D20406
llvm-svn: 270109
A baby step toward translating DIType records to CodeView.
This does not (yet) combine the record length with the record data. I'm going back and forth trying to determine if that's a good idea.
llvm-svn: 270106
Previously, specifying -post-RA-scheduler=true had the side effect of
disabling the antidependency breaker, yielding different behavior than
if the post-RA-scheduler was enabled via the scheduling model.
Differential Revision: http://reviews.llvm.org/D20186
llvm-svn: 270077
It broke buildbot:
http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/4817/steps/ninja%20check%201/logs/stdio
Actually it is just because D20273 not yet commited, but these 2 were crossing with each other,
and I`ll better find the way to land them separatelly soon.
Initial commit message:
[llvm-mc] - Teach llvm-mc to generate compressed debug sections in zlib style.
Before this patch llvm-mc generated zlib-gnu styled sections.
That means no SHF_COMPRESSED flag was set, magic 'zlib' signature
was used in combination with full size field. Sections were renamed to "*.z*".
This patch reimplements the compression style to zlib one as zlib-gnu looks
to be depricated everywhere.
Differential revision: http://reviews.llvm.org/D20331
llvm-svn: 270075
There are at least 2 places (DAGCombiner, X86ISelLowering) where this could be used instead
of ad-hoc and watered down code that is trying to match a power-of-2 pattern.
Differential Revision: http://reviews.llvm.org/D20439
llvm-svn: 270073
This patch changes the order in which we attempt to prove the independence of
strided accesses. We previously did this after we knew the dependence distance
was positive. With this change, we check for independence before handling the
negative distance case. The patch prevents LAA from reporting forward
dependences for independent strided accesses.
This change was requested in the review of D19984.
llvm-svn: 270072
Before this patch llvm-mc generated zlib-gnu styled sections.
That means no SHF_COMPRESSED flag was set, magic 'zlib' signature
was used in combination with full size field. Sections were renamed to "*.z*".
This patch reimplements the compression style to zlib one as zlib-gnu looks
to be depricated everywhere.
Differential revision: http://reviews.llvm.org/D20331
llvm-svn: 270070
Mask0Imm and ~Mask1Imm must be equivalent and one of the MaskImms is a shifted
mask (e.g., 0x000ffff0). Both 'and's must have a single use.
This changes code like:
and w8, w0, #0xffff000f
and w9, w1, #0x0000fff0
orr w0, w9, w8
into
lsr w8, w1, #4
bfi w0, w8, #4, #12
llvm-svn: 270063
Fixes for MUBUF_Atomic instructions to make operand list valid:
- For RTN insns, make a copy of $vdata_in operand as $vdata.
- Do not add operand for GLC, it is hardcoded and comes as a token.
Workaround to avoid adding multiple default optional operands.
Tests added.
Differential Revision: http://reviews.llvm.org/D20257
llvm-svn: 270049
Enable "Remove Redundant LEAs" part of the LEA optimization pass for -O2.
This gives 6.4% performance improve on Broadwell on nnet benchmark from Coremark-pro.
There is no significant effect on other benchmarks (Geekbench, Spec2000, Spec2006).
Differential Revision: http://reviews.llvm.org/D19659
llvm-svn: 270036
Transition InstrProf and Coverage over to the stricter Error/Expected
interface.
Changes since the initial commit:
- Fix error message printing in llvm-profdata.
- Check errors in loadTestingFormat() + annotateAllFunctions().
- Defer error handling in InstrProfIterator to InstrProfReader.
- Remove the base ProfError class to work around an MSVC ICE.
Differential Revision: http://reviews.llvm.org/D19901
llvm-svn: 270020
Summary:
Implement guard widening in LLVM. Description from GuardWidening.cpp:
The semantics of the `@llvm.experimental.guard` intrinsic lets LLVM
transform it so that it fails more often that it did before the
transform. This optimization is called "widening" and can be used hoist
and common runtime checks in situations like these:
```
%cmp0 = 7 u< Length
call @llvm.experimental.guard(i1 %cmp0) [ "deopt"(...) ]
call @unknown_side_effects()
%cmp1 = 9 u< Length
call @llvm.experimental.guard(i1 %cmp1) [ "deopt"(...) ]
...
```
to
```
%cmp0 = 9 u< Length
call @llvm.experimental.guard(i1 %cmp0) [ "deopt"(...) ]
call @unknown_side_effects()
...
```
If `%cmp0` is false, `@llvm.experimental.guard` will "deoptimize" back
to a generic implementation of the same function, which will have the
correct semantics from that point onward. It is always _legal_ to
deoptimize (so replacing `%cmp0` with false is "correct"), though it may
not always be profitable to do so.
NB! This pass is a work in progress. It hasn't been tuned to be
"production ready" yet. It is known to have quadriatic running time and
will not scale to large numbers of guards
Reviewers: reames, atrick, bogner, apilipenko, nlewycky
Subscribers: mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D20143
llvm-svn: 269997
Summary: Set default branch weight to 1:1 if one of the branch has profile missing when simplifying CFG.
Reviewers: spatel, davidxl
Subscribers: danielcdh, llvm-commits
Differential Revision: http://reviews.llvm.org/D20307
llvm-svn: 269995
- glibc is dynamically linked, and
- libgcc_s is unavailable (for instance, another library is being used to
provide the compiler runtime or libgcc is statically linked), and
- the target is x86_64.
If we run backtrace() and it fails to find any stack frames, try using
_Unwind_Backtrace instead if available.
llvm-svn: 269992
Having an enum member named Default is quite confusing: Is it distinct
from the others?
This patch removes that member and instead uses Optional<Reloc> in
places where we have a user input that still hasn't been maped to the
default value, which is now clear has no be one of the remaining 3
options.
llvm-svn: 269988
isReturn() was returning different values with and without -g which led to
different code being generated. Change isFlagSettingInstruction to query
an instruction's effect on SR instead.
llvm-svn: 269986
Use signed division otherwise all back jumps fail the check
Fixes regression introduced in r269951
Differential Revision: http://reviews.llvm.org/D20380
llvm-svn: 269972
When looking for an available spill slot, the register scavenger would stop
after finding the first one with no register assigned to it. That slot may
have size and alignment that do not meet the requirements of the register
that is to be spilled. Instead, find an available slot that is the closest
in size and alignment to one that is needed to spill a register from RC.
Differential Revision: http://reviews.llvm.org/D20295
llvm-svn: 269969
with an additional fix to make RegAllocFast ignore undef physreg uses. It would
previously get confused about the "push %eax" instruction's use of eax. That
method for adjusting the stack pointer is used in X86FrameLowering::emitSPUpdate
as well, but since that runs after register-allocation, we didn't run into the
RegAllocFast issue before.
llvm-svn: 269949
Use register class that does not include them when looking
for unallocated registers.
This is hit by the udiv v8i64 test in the opencl integer
conformance test, and takes a few seconds to compile in
a debug build so no test included.
llvm-svn: 269938
Don't expand divisions by constants if it would require multiple instructions.
The current assumption is that engines will perform the desired optimizations.
llvm-svn: 269930
Summary:
The ordering of registers in BinaryRRF instructions are wrong, and
affects the copysign instruction (CPSDR). This results in the wrong
magnitude and sign being set.
Author: zhanjunl
Reviewers: kbarton, uweigand
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D20308
llvm-svn: 269922
Summary:
MONITORX/MWAITX instructions provide similar capability to the MONITOR/MWAIT
pair while adding a timer function, such that another termination of the MWAITX
instruction occurs when the timer expires. The presence of the MONITORX and
MWAITX instructions is indicated by CPUID 8000_0001, ECX, bit 29.
The MONITORX and MWAITX instructions are intercepted by the same bits that
intercept MONITOR and MWAIT. MONITORX instruction establishes a range to be
monitored. MWAITX instruction causes the processor to stop instruction execution
and enter an implementation-dependent optimized state until occurrence of a
class of events.
Opcode of MONITORX instruction is "0F 01 FA". Opcode of MWAITX instruction is
"0F 01 FB". These opcode information is used in adding tests for the
disassembler.
These instructions are enabled for AMD's bdver4 architecture.
Patch by Ganesh Gopalasubramanian!
Reviewers: echristo, craig.topper, RKSimon
Subscribers: RKSimon, joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D19795
llvm-svn: 269911
MC only needs to know if the output is PIC or not. It never has to
decide about creating GOTs and PLTs for example. The only thing that
MC itself uses this information for is expanding "macros" in sparc and
mips. The rest I am pretty sure could be moved to CodeGen.
This is a cleanup and isolates the code from future changes to
Reloc::Model.
llvm-svn: 269909
In truncateToMinimalBitwidths() we were RAUW'ing an instruction then erasing it. However, that intruction could be cached in the map we're iterating over. The first check is "I->use_empty()" which in most cases would return true, as the (deleted) object was RAUW'd first so would have zero use count. However in some cases the object could have been polluted or written over and this wouldn't be the case. Also it makes valgrind, asan and traditionalists who don't like their compiler to crash sad.
No testcase as there are no externally visible symptoms apart from a crash if the stars align.
Fixes PR26509.
llvm-svn: 269908
It defined the LLVM_AVR_GCC_COMPAT constant, which would enable/disable
certain GCC-specific behaviours.
There is no point conditionally turning it on/off, as it will always be
turned on, and we have to maintain both code paths anyway.
llvm-svn: 269904
Restrict the creation of compact branches so that they do meet the ISA
requirements. Notably do not permit $zero to be used as a operand for compact
branches and ensure that some other branches fulfil the requirement that
rs != rt.
Fixup cases where $rs > $rt for bnec and beqc.
Recommit of rL269893 with reviewers comments.
Reviewers: dsanders, vkalintiris
Differential Review: http://reviews.llvm.org/D20284
llvm-svn: 269899
Restrict the creation of compact branches so that they meet the ISA encoding
requirements. Notably do not permit $zero to be used as a operand for compact
branches and ensure that some other branches fulfil the requirement that
rs != rt.
Fixup cases where $rs > $rt for bnec and beqc.
Reviewers: dsanders, vkalintiris
Differential Review: http://reviews.llvm.org/D20284
llvm-svn: 269893
This change adds support for software floating point operations for Sparc targets.
This is the first in a set of patches to enable software floating point on Sparc. The next patch will enable the option to be used with Clang.
Differential Revision: http://reviews.llvm.org/D19265
llvm-svn: 269892
Coverage mapping data is organized in a sequence of blocks, each of which is expected
to be aligned by 8 bytes. This feature is used when reading those blocks, see
VersionedCovMapFuncRecordReader::readFunctionRecords(). If a misaligned covearge
mapping data has more than one block, it causes llvm-cov to fail.
Differential Revision: http://reviews.llvm.org/D20285
llvm-svn: 269887
* Reworks the CVSymbolTypes.def to work similarly to TypeRecords.def.
* Moves some enums from SymbolRecords.h to CodeView.h to maintain
consistency with how we do type records.
* Generalize a few simple things like the record prefix
* Define the leaf enum and the kind enum similar to how we do with tyep
records.
Differential Revision: http://reviews.llvm.org/D20342
Reviewed By: amccarth, rnk
llvm-svn: 269867
I don't yet fully understand the meaning of these data strcutures,
but at least it seems that their sizes and types are correct.
With this change, we can read publics streams till end.
Differential Revision: http://reviews.llvm.org/D20343
llvm-svn: 269861
We currently don't represent get_local and set_local explicitly; they
are just implied by virtual register use and def. This avoids a lot of
clutter, but it does complicate stackifying: get_locals read their
operands at their position in the stack evaluation order, rather than
at their parent instruction. This patch adds code to walk the stack to
determine the precise ordering, when needed.
llvm-svn: 269854
instead of having DwarfUnit query the debugger tuning options.
Follow-up commmit to r269827.
Thanks to Paul Robinson for pointing this out!
llvm-svn: 269840
This patch moves the expansion of WIN_ALLOCA pseudo-instructions
into a separate pass that walks the CFG and lowers the instructions
based on a conservative estimate of the offset between the stack
pointer and the lowest accessed stack address.
The goal is to reduce binary size and run-time costs by removing
calls to _chkstk. While it doesn't fix all the code quality problems
with inalloca calls, it's an incremental improvement for PR27076.
Differential Revision: http://reviews.llvm.org/D20263
llvm-svn: 269828
As discovered in PR27758, GDB does not fully support the DWARF 4 format.
This patch ensures we always emit bitfields in the DWARF 2 when tuning for GDB.
llvm-svn: 269827
When processing inline asm that contains errors, make sure we can recover
gracefully by creating an UNDEF SDValue for the inline asm statement before
returning from SelectionDAGBuilder::visitInlineAsm. This is necessary for
consumers that don't exit on the first error that is emitted (e.g. clang)
and that would assert later on.
Fixes PR24071.
Patch by Diana Picus.
llvm-svn: 269811
This adds support for all the MachO *_command structures. The load_command payloads still are not represented, but that will come next.
llvm-svn: 269808
This bug was introduced in r269728 and is the likely cause of many stage 2 ubsan bot failures.
I'll add a test in a follow-up commit assuming this fixes things properly.
llvm-svn: 269797
... for AddRec's in loops for which SCEV is unable to compute a max
tripcount. This is the NUW variant of r269211 and fixes PR27691.
(Note: PR27691 is not a correct or stability bug, it was created to
track a pending task).
llvm-svn: 269790
when the object is in an archive to use something like libx.a(foo.o) as part of
the error message.
Also changed llvm-objdump and llvm-size to be like llvm-nm and ignore non-object
files in archives and not produce any error message.
To do this Archive::Child::getAsBinary() was changed from ErrorOr<...> to
Expected<...> then that was threaded up to its users.
Converting this interface to Expected<> from ErrorOr<> does involve
touching a number of places. To contain the changes for now the use of
errorToErrorCode() is still used in one place yet to be fully converted.
Again there some were bugs in the existing code that did not deal with the
old ErrorOr<> return values. So now with Expected<> since they must be
checked and the error handled, I added a TODO and a comments for those.
llvm-svn: 269784
This adds support for all the MachO *_command structures. The load_command payloads still are not represented, but that will come next.
llvm-svn: 269782
This just checks that we emit all type records once, and then after
merging the type stream with no other type streams, we still emit every
kind of type record.
We could test the dumper output more closely, but that would make the
test very brittle. Currently we're just getting coverage.
llvm-svn: 269778
Since r207518 they are printed exactly like non-hidden stubs on x86 and
since r207517 on ARM.
This means we can use a single set for all stubs in those platforms.
llvm-svn: 269776
Summary:
Add support to control where files for a distributed backend (the
individual index files and optional imports files) are created.
This is invoked with a new thinlto-prefix-replace option in the gold
plugin and llvm-lto. If specified, expects a string of the form
"oldprefix:newprefix", and instead of generating these files in the
same directory path as the corresponding bitcode file, will use a path
formed by replacing the bitcode file's path prefix matching oldprefix
with newprefix.
Also add a new replace_path_prefix helper to Path.h in libSupport.
Depends on D19636.
Reviewers: joker.eph
Subscribers: llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D19644
llvm-svn: 269771
This is assertion is no longer necessary since we never record
constants in the live set anyway. (They are never recorded in
the initial live set, and constant bases are removed near line 2119)
Differential Revision: http://reviews.llvm.org/D20293
llvm-svn: 269764
The movw instruction is only available in ARM state for V6T2 and above.
The MOVi16 instruction has requirement HasV6T2 but the InstAlias
for mov rd, imm where the operand is imm0_65535_expr:$imm does not.
This means that movw can incorrectly be used in ARMv4 and ARMv5 by
writing mov rd, 0x1234. The simple fix is to the requirement HasV6T2
to the InstAlias. Tests added to not-armv4.s.
Patch by Peter Smith.
llvm-svn: 269761
This patch adds the commandline option -mips-compact-branches={never,optimal,always),
which controls how LLVM generates compact branches for MIPS targets. By
default, the compact branch policy is 'optimal' where LLVM will (hopefully)
pick the optimal branch for any situation. The 'never' policy will disable
the generation of compact branches and 'always' will generate compact branches
wherever possible.
Reviewers: dsanders
Differential Review: http://reviews.llvm.org/D20167
llvm-svn: 269753
PrologEpilogInserter has these 3 phases, which are related, but not
all of them are needed by all targets. This patch reorganizes PEI's
varous functions around those phases for more clear separation. It also
introduces a new TargetMachine hook, usesPhysRegsForPEI, which is true
for non-virtual targets. When it is true, all the phases operate as
before, and PEI requires the AllVRegsAllocated property on
MachineFunctions. Otherwise, CSR spilling and scavenging are skipped and
only prolog/epilog insertion/frame finalization is done.
Differential Revision: http://reviews.llvm.org/D18366
llvm-svn: 269750
MachineInstr::isSafeToMove is more conservative than is needed here;
use a more explicit check, and incorporate knowledge of some
WebAssembly-specific opcodes.
llvm-svn: 269736
The DWARF spec states that a member entry may have either a
DW_AT_data_member_location or a DW_AT_data_bit_offset, but not both.
This fixes a bug found in PR 27758.
llvm-svn: 269731
Fix a bug introduced with rL269426 :
[InstCombine] canonicalize* LE/GE vector integer comparisons to LT/GT (PR26701, PR26819)
We were assuming that a ConstantDataVector / ConstantVector / ConstantAggregateZero operand of
an ICMP was composed of ConstantInt elements, but it might have ConstantExpr or UndefValue
elements. Handle those appropriately.
Also, refactor this function to join the scalar and vector paths and eliminate the switches.
Differential Revision: http://reviews.llvm.org/D20289
llvm-svn: 269728
This code currently relies on static methods in ProfileSummary to determine whether a function is hot or unlikley. I am refactoring the ProfileSummary code and these methods will be removed. As discussed offline, the right way to re-introduce this is to add a pass to annotate functions with unlikely/hot hints and use the hints to determine the prefix here.
llvm-svn: 269726
The DWARF spec clearly states that a bit field member should have either a
DW_AT_byte_size or a DW_AT_bit_size, but not both.
Also the DW_AT_byte_size is redundant with the size of the type of the member.
This fixes a bug found in PR 27758.
llvm-svn: 269714
This was assuming it could use all memory before, which is
a bad decision because it restricts occupancy.
By default, only try to use enough space that could reduce
occupancy to 7, an arbitrarily chosen limit.
Based on the exist LDS usage, try to round up to the limit
in the current tier instead of further hurting occupancy.
This isn't ideal, because it doesn't accurately know how much
space is going to be used for alignment padding.
llvm-svn: 269708
In practice only a few well known appending linkage variables work.
Currently if codegen sees an unknown appending linkage variable it will
just print it as a regular global. That is wrong as the symbol in the
produced object file has different semantics as the one provided by the
appending linkage.
This just errors early instead of producing a broken .o.
llvm-svn: 269706
Allow two users of the condition if the other user
is also a min/max select. i.e.
%c = icmp slt i32 %x, %y
%min = select i1 %c, i32 %x, i32 %y
%max = select i1 %c, i32 %y, i32 %x
llvm-svn: 269699
Summary:
Fix bug in MachO path where a frame index offset would not be reserved
for handling large frames when an extra non-used callee-save register
was saved. In the case where the extra register is reserved or not a
GPR (e.g. %FP in the MachO case), this would lead to the register
scavenger later failing when called from PrologEpilogInserter.
Reviewers: t.p.northover
Subscribers: aemerson, rengolin, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D20185
llvm-svn: 269697
Transition InstrProf and Coverage over to the stricter Error/Expected
interface.
Changes since the initial commit:
- Address undefined-var-template warning.
- Fix error message printing in llvm-profdata.
- Check errors in loadTestingFormat() + annotateAllFunctions().
- Defer error handling in InstrProfIterator to InstrProfReader.
Differential Revision: http://reviews.llvm.org/D19901
llvm-svn: 269694
Summary: On Linux, /usr/include/bits/byteswap-16.h defines __byteswap_16(x) as an inlined LRVH (Load Reversed Half-word) instruction. The SystemZ back-end did not support this opcode and the inlined assembly would cause a fatal error.
Reviewers: bryanpkc, uweigand
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18732
llvm-svn: 269688
This is a compile time optimization: keeping a large file to process
at the end hurts parallelism.
The heurisitic used right now is the input buffer size, however we
may want to consider the number of functions to import or the
different number of files to load for importing as well.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 269684
This is reducing pressure on the OS memory system, and is NFC
when not using a cache.
I measure a 10x memory consumption reduction when linking opt
with full debug info.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 269682
The new X86 shuffle lowering can do just fine without transforming vselects
into vector_shuffles. It looks like the only thing this code does right now
is cause trouble - in particular, it can lead to combine/legalization infinite
loops.
Note that it's not completely NFC, since some of the shuffle masks get inverted,
which may cause slight differences further down the line. We may want to find
a way to invert those masks, but that's orthogonal to this commit.
This fixes the hang in PR27689.
llvm-svn: 269676
This patch renames the option enabling the store-to-load forwarding conflict
detection optimization. This change was requested in the review of D20241.
llvm-svn: 269668
Also s/Cycles/Iters/ in NumCyclesForStoreLoadThroughMemory to make it
clear that this is not about clock cycles but loop cycles/iterations.
llvm-svn: 269667
The selection of the vectorization factor currently doesn't consider
interleaved accesses. The vectorization factor is based on the maximum safe
dependence distance computed by LAA. However, for loops with interleaved
groups, we should instead base the vectorization factor on the maximum safe
dependence distance divided by the maximum interleave factor of all the
interleaved groups. Interleaved accesses not in a group will be scalarized.
Differential Revision: http://reviews.llvm.org/D20241
llvm-svn: 269659
Without a diagnostic handler installed, llc's behaviour is to exit on the first
error that it encounters. This is very different from the behaviour of clang
and other front ends, which try to gather as many errors as possible before
exiting.
This commit adds a diagnostic handler to llc, allowing it to find and report
more than one error. The old behaviour is preserved under a flag (-exit-on-error).
Some of the tests fail with the new diagnostic handler, so they have to use the
new flag in order to run under the previous behaviour. Some of these are known
bugs, others need further investigation. Ideally, we should fix the tests and
remove the flag at some point in the future.
Reapplied after fixing the LLDB build that was broken due to the new
DiagnosticSeverity in LLVMContext.h, and fixed an UB in the new change.
Patch by Diana Picus.
llvm-svn: 269655